




Vol. 42 JANUARY, 1951 No. 1 


The Journal of Educational 
Psychology 


Devoted Primarily to the Scientific Study of Problems of Learning and Teaching








CONTENTS
Frequent Testing as a Motivating Factor in Large Lecture Classes
MILDRED L. FITCH, A. J. DRUCKER, AND J. A. NORTON, JR.

A Validity Study of Personality Questionnaires at the Upper Elementary Grade Level
WILLIAM J. LODGE

Academic Achievements of Veterans and Non-Veterans at the City College of New York
LOUIS LAURO AND JAMES D. PERRY

Readability of Children’s Textbooks
EDMUND W. J. FAISON

Selection of Students for a Trade and Industrial Education Curriculum
H. S. BELMAN AND R. N. EVANS




$6.00 per Year - Published Monthly October to May 


WARWICK & YORK, INC. 
BALTIMORE 2, MD. 


Entered as Second Class Matter Nov. 15, 1921, at the Post Office at Baltimore, Md. 
under the Act of March 3, 1879; additional entry as Second Class Matter at York, Pa. 











THE JOURNAL OF 
Educational Psychology 


Established 1910 


EDITORIAL BOARD 


Stephen M. Corey, Teachers College, Columbia University (Learning)
J. B. Stroud, State University of Iowa (Individual Differences)
Jack W. Dunlap, 10 E. 49th St., New York 17, N.Y. (Technical, Statistics)
Percival M. Symonds, Teachers College, Columbia University (Mental Hygiene)
Karl J. Holzinger, University of Chicago (Factor Analysis)
Miles A. Tinker, University of Minnesota (Reading)
Harold E. Jones, University of California (Social Behavior, Child Psychology)
Alexander G. Wesman, The Psychological Corporation (Tests)
H. H. Remmers, Purdue University (Attitudes, Teacher Evaluation)
Paul A. Witty, Northwestern University (Children’s Interests)

H. E. Buchholz, Managing Editor
10 E. Centre St., Baltimore 2, Md. 


The Journal of Educational Psychology is devoted primarily
to the scientific study of problems of learning, teaching,
and measurement of the psychological development of the individual.
The Journal will contain articles on the following subjects:
the psychology of school subjects; experimental studies of
learning; the development of interests, attitudes, and personality, 
particularly as related to school adjustment; emotion, motivation, 
and character; mental development and methods. This last will 
include tests, statistical techniques, and research techniques in 
cross-sectional and developmental studies. 


Manuscripts may be submitted to any member of the Editorial 
Board, but the handling of an article will be facilitated if it is sent 
to that member of the Board who is designated as particularly 
interested in the phase of psychology dealt with. (Such designations 
appear after the Editors’ names in the list given above.) 

Books and other materials for review and correspondence 
regarding editorial matters should be addressed to The Journal 
of Educational Psychology, Warwick and York, Inc., Publishers, 
10 E. Centre St., Baltimore 2, Md. 


Manuscripts should be typed and double-spaced throughout, 
including quotations, footnotes, and references. In order to attain 


(Continued on Inside Back Cover) 

























FREQUENT TESTING AS A MOTIVATING FACTOR 
IN LARGE LECTURE CLASSES 


MILDRED L. FITCH, A. J. DRUCKER, and J. A. NORTON, JR. 
Purdue University 


Any instructional procedure is useful which stimulates the 
interest of the student and produces desirable activity leading 
to educational accomplishment and growth. Although the lec- 
ture method, as used in colleges and universities, has not been 
shown to be an ineffective method of instruction, even with large 
groups,'* many instructors, especially in the social sciences, like 
to supplement lectures with recitation or student-instructor dis- 
cussions or avoid lectures as much as possible. Large enroll- 
ments in basic college courses pretty much preclude discus- 
sion or recitation methods whose aim is extensive student 
participation. 

This study deals with the investigation of two interrelated 
instructional devices employed to improve student achievement 
in a lecture course in Government, where classes number from one 
hundred to two hundred. To the regular three one-hour lecture 
sessions are added several hour-long discussion sessions, attend- 
ance at any of which is optional on the part of the student. The 
other device is a program of weekly quizzes over course material, 
where results of such testing are not used in determining the 
course grade. Frequent measurement is expected to result in 
steadier application of the individual to the task at hand, 
this instructional function of measurement being best served 
when divorced from the regular process of achievement 
evaluation. 

There is some objective evidence that frequent testing in the 
classroom results in better achievement.‘ '? Knowledge of 
results, a probable outcome of frequent testing, has also been 
shown to aid learning.!® 




EXPERIMENTAL PROCEDURE 


The current study was undertaken to explore further the 
effects of frequent testing upon motivation of college students 
to achieve and the students’ outside endeavor with respect to a 
particular course content. The achievement of two large classes
of students in Government 10, a required course in the School of
Science at Purdue University, was used to determine the benefits
to learning gained when (a) short quizzes for the purpose of guid- 
ing the student’s own achievement were given in one class over 
the weekly assignments, and (b) voluntary discussion groups were 
provided for both classes each week in addition to the three reg- 
ular class periods. The control group was designated as the class 
of ninety-seven students to whom only the regular monthly 
quiz was given, and the experimental group consisted of the other 
class of one hundred ninety-eight students to whom the weekly 
quiz was administered in addition to the regular monthly quiz. 
Both classes met at 9 o’clock on alternate days three times a week. 
The senior author was the lecturer for the course. Assignment 
of individuals to these classes was not believed to be related to 
factors of sex, age, ability or curriculum. No knowledge of the 
study was given to the students. 

The lecture method was used over identical material for both 
classes and the same reading assignments were made. The last 
half hour of the third class meeting each week was open for dis- 
cussion or questions from the floor, primarily over the reading 
assignments. This was followed by a ten-minute quiz in the 
experimental section only, and this quiz was confined to the 
reading assignments in the texts and not to the supplementary 
material covered in the lectures. Thus both classes had identical 
lectures and reading assignments and opportunity to raise ques- 
tions for discussion in one regular class period. 

Four one-hour tests were administered to both groups during 
the semester over the work covered each month. A fifth test, 
which reviewed topics covered during the semester, was also 
given at the end of the course. The first, fourth and fifth tests 
were objective—true-false, multiple choice, matching or short 
answer—and the second and third were essay-type tests. 

The three objective tests were identical for both classes. The 
two essay tests were composed of ‘thought’ questions given to 










students of both classes before the day of testing. Every effort 
was made to make the essay tests comparable. All tests were 
administered to the experimental section before the control sec- 
tion; hence it is believed that advance knowledge of the objective 
test questions obtained before the test would benefit only the 
control section. 

Six voluntary discussion sections, set up at different hours 
during the middle of the week at times which would best accom- 
modate the students, were handled by two graduate students. * 
These sections were arranged as an outlet for discussion for stu- 
dents who felt they would learn more if they had the opportunity 
for more personal contact with an instructor than large lecture 
sections could provide. It was evident, however, that attend- 
ance at the discussion sections was increased when topics for 
monthly test review could be discussed, although this was not 
the original intention. A poll of the sections showed that the 
use of the discussion groups on a voluntary basis was favored by a 
substantial majority of both groups. Weekly quizzes were also 
favored by both groups, the control people having become famil- 
iar with weekly quizzes through their use in Government 9 the 
previous semester. As a matter of fact, when the weekly quizzes 
were not resumed at the start of the semester for the control 
section, some complaints were registered by students who had 
favored that technique in Government 9 as a device to encourage 
them to read their weekly assignments. 


ANALYSIS AND RESULTS†


The Variables.—The criterion variable for evaluating the exper- 
imental results is based on the grades assigned in the five one-hour 
tests given in both experimental and control sections, and graded 
jointly for both sections. The grades assigned on each test were 
based on the Purdue University number grading system, with 
values one through six, but greater differentiation was available 
through the use of plusses and minusses appended to the number 
grade. These grades were assigned on each test on the basis of 
the frequency distribution of scores on that test, and were deter- 





* Mrs. Ingebord Vance and Mrs. Leone Katz also assisted in the compila- 
tion of the data and preliminary statistical work. 
† By Mr. Norton.










mined by a standard grading curve giving the approximate pro- 
portion of students who should receive each grade. Thus the 
grade distribution on each test represents not only a standardized, 
but also, approximately, a normalized variable, since the stand- 
ard grading curve was based (roughly) on the normal curve. The 
criterion variable Y was based on the sum of the grades on these 
five tests. A sum of approximately normal variables, Y itself 
was found to be approximately normally distributed. 

Because of the conditions of scheduling, registration, and other 
administrative processes which make it difficult to carry on a 
controlled educational experiment in a large university situation, 
the experimental and control sections cannot be represented as 
random samples from any definable common population. For 
these reasons it was desirable to exercise statistical control 
over the nature of the two groups to be used for evaluating the 
experimental variable. The most easily available, as well as the 
most relevant variable for this purpose was the grade received 
in the preceding semester of the same course, Government 9. 
This grade is designated as X, the independent or control vari- 
able. It was available only in the gross number grades of the 
Purdue University grading system, with six possible values, and 
only five occurring in the sample used for analysis. However, 
this variable is believed to be entirely satisfactory for the pur- 
poses of statistical control for which it is used here. An addi- 
tional valuable property of the criterion variable Y is that the 
regression of Y on X was almost perfectly linear. 

Factors for the Analysis.—Table 1 gives in percentage form the 
distribution of frequency of voluntary attendance at the discus- 
sion groups for all members of both sections whose records were 
sufficiently complete to make them eligible for inclusion in the 
analysis. Table 1 also gives the break-down into four categories 
which was subsequently used for the discussion attendance vari- 
able, and the percentage distribution for this break-down. The 
upper two boundary lines for this break-down were chosen 
to conform, approximately, with natural valleys in the two 
distributions. 

The distributions in Table 1 invite interesting speculative 
hypotheses. In the first place, an almost identical percentage of 
cases in each section attended no discussion groups. One is 
tempted to speculate that these represent the students who are 










TABLE 1. ATTENDANCE AT DISCUSSION GROUPS: ALL ELIGIBLE
CASES IN BOTH SECTIONS

                    Control Section                  Experimental Section
Discussion                     Per cent in                      Per cent in
Groups                         grouping                         grouping
Attended      N    Per cent    chosen          N    Per cent    chosen
   14         0      0                        11      5.9
   13         1      1.1                      14      7.5
   12         1      1.1                       8      4.3
   11         1      1.1       16 = 17.6%     10      5.4       76 = 40.9%
   10         5      5.5                      12      6.5
    9         3      3.3                       9      4.8
    8         4      4.4                       8      4.3
    7         1      1.1                       4      2.2
    6         5      5.5                       3      1.6
    5         5      5.5       19 = 20.9%      2      1.1       21 = 11.3%
    4         5      5.5                       9      4.8
    3         4      4.4                       7      3.8
    2        10     11.0                       4      2.2
    1        13     14.3       23 = 25.3%     17      9.1       21 = 11.3%
    0        33     36.3       36.3%          68     36.6       36.6%
 Total       91    100         100%          186    100         100%























simply not interested in any activity in connection with the 
course which is not required of them, and the equal proportions 
suggest that this attitude is in no way affected by a motiva- 
tional device of the type experimented with here (frequent 
quizzing). Of the sixty-four per cent of the students who would 
and did attend one or more voluntary discussion groups, how- 
ever, there was a marked difference in behavior between the two 
sections. Of the students in the control section who would 
attend any discussion groups at all, only about one-quarter 










attended more than six. In all probability this is directly 
related to the fact that these students were given only the five 
one-hour tests during the course of the semester, as it has been 
noted above that the size of the attendance at the discussion 
groups seemed to be related to the dates of these tests. In con- 
trast, of the students in the experimental, frequently-quizzed 
section who would attend any discussion groups at all, nearly 
two-thirds attended seven or more. It seems clear that for those 
students who will participate at all in a voluntary activity of this 
sort, the motivational device experimented with here bears a 
significant and important relationship to frequency of attend- 
ance at the discussion groups. 
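The grouping and the two proportions quoted above can be checked directly from the counts in Table 1. What follows is only a minimal sketch: the function and variable names are ours, introduced for illustration, and the counts are transcribed from the table.

```python
# Number of students attending k discussion groups, k = 14 down to 0,
# transcribed from Table 1.
ATTENDED = list(range(14, -1, -1))
CONTROL = [0, 1, 1, 1, 5, 3, 4, 1, 5, 5, 5, 4, 10, 13, 33]
EXPERIMENTAL = [11, 14, 8, 10, 12, 9, 8, 4, 3, 2, 9, 7, 4, 17, 68]


def group_counts(counts):
    """Collapse the fifteen attendance values into the four groupings."""
    groups = {"7-14": 0, "3-6": 0, "1-2": 0, "None": 0}
    for k, n in zip(ATTENDED, counts):
        if k >= 7:
            groups["7-14"] += n
        elif k >= 3:
            groups["3-6"] += n
        elif k >= 1:
            groups["1-2"] += n
        else:
            groups["None"] += n
    return groups


def frequent_share_among_attenders(counts):
    """Share attending seven or more groups, among those attending any."""
    groups = group_counts(counts)
    attenders = sum(groups.values()) - groups["None"]
    return groups["7-14"] / attenders


for name, counts in (("control", CONTROL), ("experimental", EXPERIMENTAL)):
    print(name, group_counts(counts),
          round(frequent_share_among_attenders(counts), 3))
```

On these counts, 16 of the 58 control students who attended at all went to seven or more groups (about 28 per cent), against 76 of 118 in the experimental section (about 64 per cent).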

Several questions now arise. Is attendance at discussion 
groups related to achievement in the course, as measured by Y? 
If it is, and if the experimental section shows superior achieve- 
ment over the control section, to what extent can this superiority 
be explained by the more frequent attendance at discussion
groups motivated in the experimental section? Even more 
important, what are the answers to these same questions if the 
effects of the variable X (Government 9 grades) which, among 
other things, presumably measures initial interest and ability 
in this course, are partialled out? The analysis was designed to 
answer these questions. 

Design of the Analysis.—Since it was desired to make the com- 
parison between the achievement of the experimental section 
and that of the control section taking into account the factor of 
discussion attendance, it was necessary to classify the data 
according to a double-classification scheme. Table 2 shows this
scheme as a 4 × 2 table of eight cells: four classes on the vertical
axis representing the four groupings of discussion attendance 
frequency given in Table 1 and two classes on the horizontal axis 
representing membership in the control or in the experimental 
section. 

The methods employed were those of analysis of covariance. 
Our application of these methods differs, however, from those 
more commonly occurring. We have pointed out, in the dis- 
cussion of Table 1, the marked dissimilarity in the distributions 
into discussion attendance classifications between the control and 
experimental sections. This is a situation that is frequently 
called the case of disproportionate frequencies (the ‘non-orthog- 













onal case’). The disproportionate frequencies situation requires 
special procedures which we shall not go into in detail. (See 
Snedecor® or Tsao.'!) We shall, however, discuss some of the 
basic considerations involved. * 
In the case of disproportionate frequencies one must choose 
between two basic assumptions on which to proceed:" 
Assumption A: We are interested in making inferences for a 
population which has equal frequencies in the cells formed 
by our classificatory factors; the unequal and dispro- 
portionate cell frequencies in our sample are merely a 
result of chance sampling effects, and are not character- 
istic of the population in which we are interested. 
Assumption B: The disproportionate cell frequencies in our 
sample are the result of characteristics inherent in the 
classificatory factors to be studied, and we are interested 
in inferring to a population whose cell frequencies are 
stratified approximately as are those in our sample. 
From our discussion of the dissimilar distributions in Table 1 
it is clear that Assumption A was not the one for us. We chose 
to assume instead that the dissimilarity in the two distributions 





* No attempt will be made here to go into detail on the theory and pro- 
cedures of these methods. A word is in order, however, on the intent and 
function of the analysis of covariance in an investigation of this sort. We 
have mentioned that we desired to exercise statistical control over the 
variable X—final grade in Government 9. In most research work in edu- 
cation and other social sciences, such control is achieved by matching cases 
in the two groups to be compared on the variable to be controlled. If this 
is done ex post facto, as it frequently must be,? there is usually a large 
resulting loss in cases which must be discarded because no match can be 
found for them in the other group or groups. The very great advantage of 
the methods of analysis of covariance in an investigation of this sort is that 
it achieves the effect of such individual matching on the variable to be 
controlled, without the necessity of carrying out the actual matching pro- 
cedures, and therefore without the loss of cases that usually attends such 
procedures. Furthermore, the application of these methods is readily 
extended to more highly complex designs, involving multiple-classification 
on several factors to be studied simultaneously (as in our case, where we 
have a design involving double classification on two factors to be evaluated 
simultaneously), whereas individual matching procedures become prohibi- 
tively unwieldy on such complex designs. The methods also extend to the 
statistical control of more than one variable, where again individual match- 
ing procedures become difficult, if not impossible. For additional informa- 
tion about covariance see references 6, 5, 3, and 9. 










in Table 1 was a direct result of the motivational device which 
constitutes the experimental teaching method, and this is cer- 
tainly just as true for the population to which we are interested 
in inferring as it is for the sample at hand. We could not, how- 
ever, make Assumption B and proceed on the basis of the dis- 
tributions as given in Table 1, either. To do so would have 
meant that we were assuming that it is always characteristic of 
the experimental, frequently-quizzed groups to run, on the aver- 
age, approximately twice as large as less frequently-quizzed 
groups like the control group! Clearly we did not want to 
include scheduling irregularities as part of our experimental 
hypothesis. 

This problem was resolved as follows: using a table of random 
sampling numbers “© or “©, a random sample of ninety-one 
cases was selected from the one hundred eighty-six eligible cases 
in the experimental section. Using these ninety-one cases, and 
the ninety-one eligible cases from the control section, we could 
then set up our table of double classification for analysis. This is 
given in Table 2, which shows equal total membership in the 
control and the experimental sections, thus obviating the diffi- 
culty referred to in the last paragraph, but nevertheless retains, 
within random sampling fluctuations, the dissimilar distributions 
of discussion attendance in the two sections. On such a table 
we were able to proceed under Assumption B. 
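The drawing of an equal-sized random subsample can be sketched as below. The original selection was made by hand from a printed table of random sampling numbers; the pseudo-random generator, the seed, and the case numbering 1 to 186 used here are assumptions made purely for illustration and will not reproduce the ninety-one cases actually drawn.

```python
import random


def draw_equal_subsample(case_ids, n, seed=None):
    """Draw n distinct cases at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(list(case_ids), n))


# Hypothetical case numbers for the 186 eligible experimental students.
experimental_cases = range(1, 187)
subsample = draw_equal_subsample(experimental_cases, 91, seed=1951)
print(len(subsample), subsample[:5])
```

Because every case has the same chance of selection, the attendance distribution in the subsample should differ from that of the full section only by sampling fluctuation.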

A study of Table 2 will be helpful in understanding some of the 
results of the later analysis. We observe, first, that the experi- 
mental section with a mean of 62.7 does seem to exceed sub- 
stantially the control section with a mean of 55.5 on the criterion 
variable Y. However, note that the experimental section also 
has a greater mean on the initial variable X than does the control 
section. This latter difference might substantially affect the 
comparison between the two groups on Y. Next note the pat- 
tern of the marginal X-means for the discussion attendance 
groupings. For those students who attended any discussion 
groups at all in this course, the average final grade in the preced- 
ing semester of the same course decreases regularly with decreas- 
ing attendance at such discussion groups. This would seem to 
lend support to the hypothesis that the X variable serves as a 
measure of interest in and/or conscientious endeavor with respect 
to this course, which, for these students, led them to attend one 













or more of the discussion groups. The students who attended 
no discussion groups, however, reverse this trend, and have an 
X-mean not far from the grand X-mean for all cases. This seems 


TABLE 2.* DOUBLE CLASSIFICATION TABLE: ALL ELIGIBLE CASES
FROM CONTROL SECTION; A RANDOM SAMPLE OF EQUAL N
FROM EXPERIMENTAL SECTION

                           Control Section    Experimental Section
Discussion                 (Infrequent        (Frequent
Attendance                 Quizzing)          Quizzing)               Total
7-14            N          16 (17.6%)         34 (37.4%)              50
                Mx         4.31               4.21                    4.24
                My         63.6               65.6                    65.0
3-6             N          19 (20.9%)         14 (15.4%)              33
                Mx         4.05               4.36                    4.18
                My         59.1               62.6                    60.6
1-2             N          23 (25.3%)         10 (11.0%)              33
                Mx         3.70               3.90                    3.76
                My         50.6               60.5                    53.6
None            N          33 (36.3%)         33 (36.3%)              66
                Mx         3.88               4.15                    4.02
                My         53.0               60.4                    56.7
Total           N          91 (100%)          91 (100%)               182
                Mx         3.95               4.18                    4.06
                My         55.5               62.7                    59.1


*In the case of this and all subsequent tables, calculations were carried 
to several more significant digits than those given. 


to support the speculation, made above, that these students 
differed in some characteristic qualitative manner from the rest, 
whose observed effect is that they simply did not (perhaps could 
not) attend voluntary discussion groups, but that this does not 
necessarily imply lower interest and/or ability in this course than 
that displayed by students who did attend discussion groups. 
Because the two sections were not comparable on the X variable, 
and because the X variable seems to be related to discussion 
attendance for all those above the ‘None’ group, it is of particular










importance that we propose to ‘partial out’ by statistical control 
the ‘predetermination’ effects of X in making our comparisons. 
Note, finally, that the percentage distribution into discussion 
attendance groupings in the experimental section in Table 2 is 
similar to, but not exactly the same as that in Table 1. This, of 
course, is due to the sampling process in which we have engaged 
in order to obtain the ninety-one cases which represent the 
experimental method in Table 2. The largest discrepancy 
between the percentages in Table 1 and those in Table 2—that 
in the 3-6 discussion attendance group—is, however, no larger 
than might be expected by chance, with a t-value of 1.7 for the 
comparison between the proportion for the ninety-one cases in 
Table 2 and that for the complementary omitted group of ninety- 
five. This, of course, is to be expected, since the selection process 
was a random one. 
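The t-value of 1.7 can be approximately recovered from the counts. The paper does not state the formula used, so the sketch below assumes the usual pooled-variance test for the difference between two independent proportions. The 14 of 91 selected cases in the 3-6 group come from Table 2; the 7 of 95 omitted cases follow by subtraction from the 21 of 186 in Table 1. The function name is ours.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled-variance statistic for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# 3-6 attendance group: selected cases versus complementary omitted cases.
z = two_proportion_z(14, 91, 7, 95)
print(round(z, 2))
```

Under these assumptions the statistic comes to a little over 1.7, short of the conventional 1.96 needed for significance at the five per cent level.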

We have been discussing at some length considerations 
involved in the double classification design used here because 
this was necessary to set the stage for the choice of the sample 
actually used in the analysis, and also helps to understand what 
the design is intended to accomplish. The first steps in the 
analysis, however, did not involve the double classification 
scheme, but dealt with the two factors separately, as single 
classifications. That is, we compared the achievement of the 
experimental with that of the control section, ignoring the dis- 
tribution into discussion attendance groupings. Likewise, we 
studied the variation in achievement of the four discussion 
attendance groupings, ignoring membership in the experimental 
or control sections. Then, as the final step in the analysis, we 
dealt with the table of double classification, comparing the 
achievement of the experimental with that of the control section, 
taking into account the distribution into discussion attendance 
groupings, and vice versa. Comparison of the results in the
single and double classification analyses then helped answer some 
of the questions we posed at the end of the previous section. 

Assumptions Underlying the Analysis.—Before passing to the 
discussion of the analysis itself, it would be well to discuss briefly 
the principal assumptions inherent in the application of the 
methods of analysis of covariance: 

1) That the regression of the criterion variable on the variable
to be controlled (Y on X in our case) is fundamentally the same
in each of the several cells.










2) That this regression is linear. 

3) That the residuals after partialling out the effects of X are 
fundamentally normally distributed in each of the several cells. 

4) That these several normal distributions of residuals all 
have the same variance (assumption of homogeneity of residual 
variance). 

5) That the cases in our sample are random samples from 
each of these populations. 
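The five assumptions can be gathered into one model statement. The notation below is a modern convention and not the paper's own; the symbols $\mu$, $\tau_i$, $\beta$, and $\varepsilon_{ij}$ are introduced here only for illustration.

```latex
% Covariance model for case j in cell i (notation assumed, not the paper's).
% Y_{ij}: criterion score;  X_{ij}: Government 9 grade;  \tau_i: cell effect.
\[
  Y_{ij} = \mu + \tau_i + \beta\,(X_{ij} - \bar{X}_{..}) + \varepsilon_{ij},
  \qquad
  \varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independently}.
\]
```

A single slope $\beta$ for every cell expresses assumption 1, its entering linearly expresses assumption 2, the normal form of $\varepsilon_{ij}$ expresses assumption 3, the common $\sigma^2$ expresses assumption 4, and the independence of the $\varepsilon_{ij}$ corresponds to the random sampling of assumption 5.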

Not all of these assumptions are equally crucial. There is 
empirical evidence, for instance, that ‘moderate’ departure from 
the normality assumption does not seriously affect the validity 
of the methods. The most crucial assumption is considered 
to be that involving homogeneity of residual variance. We have 
already mentioned that, for the total sample used for analysis, the 
variable Y was approximately normally distributed, and the 
regression of Y on X quite satisfactorily linear. This, of course, 
tells us nothing about the situation in each of the several cells 
of our double classification or about the residuals, but it lends 
plausibility to the assumption that the same conditions hold in 
each of the cells without resorting to an explicit test of these 
assumptions. 

Because the assumption of homogeneity of residual variance is 
considered to be most crucial, we performed an explicit test of 
this assumption based on the double classification design, since 
our most important conclusions were to be based on the results 
of the analysis of covariance in the double classification design. 
Using Hartley’s M-test, with the tables by Thompson and 
Merrington,!® the obtained M was found to be well within the
five per cent level of significance for M, and the homogeneity
assumption therefore tenable. Furthermore, homogeneity of 
residual variance in the eight cells of the table of double 
classification implies homogeneity of residual variance in the 
groups of each of the single-classification analyses performed 
first, since each of these groups is but a combination of some of 
the eight cells. 
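For readers who wish to see the kind of quantity involved, a minimal sketch follows. It assumes that the M of Hartley's test is the statistic familiar from Bartlett's test of homogeneity of variance, namely the total degrees of freedom times the logarithm of the pooled variance, less the sum over cells of each cell's degrees of freedom times the logarithm of its own variance. That identification is our assumption, and the cell variances and degrees of freedom below are hypothetical, since the paper does not report them.

```python
from math import log


def m_statistic(variances, dfs):
    """M = (total df) * ln(pooled variance) - sum of df_i * ln(variance_i)."""
    total_df = sum(dfs)
    pooled = sum(v * d for v, d in zip(variances, dfs)) / total_df
    return total_df * log(pooled) - sum(d * log(v)
                                        for v, d in zip(variances, dfs))


# Hypothetical residual variances and degrees of freedom for eight cells.
cell_variances = [95.0, 120.0, 110.0, 101.0, 130.0, 88.0, 115.0, 105.0]
cell_dfs = [14, 17, 21, 31, 32, 12, 8, 31]
print(round(m_statistic(cell_variances, cell_dfs), 3))
```

M is zero when all the cell variances are equal and grows as they diverge; the obtained value is then referred to tabled significance points, which are not reproduced here.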

The Single Classification Analyses.—Table 3 presents the two 
single-classification analyses—that on method, ignoring discus- 
sion attendance, and that on discussion attendance, ignoring 
method. We have seen that the experimental group had a 
Y-mean appreciably higher than that for the control section, but 










we have also seen that there was a difference between the two 
groups in the same direction on whatever student characteristics 
are measured by the variable X, which presumably include inter- 
est and/or ability in this Government course. To what extent 
was the difference between the methods groups on the criterion 
variable Y due to the predetermination effect of this non-com- 
parability in X, or rather, to what extent was it not due to these 
predetermination effects, and therefore, by implication, probably 
was due to the difference between the instructional methods used 
in the two sections? The answer is contained in Columns 11 and 
12 of Table 3. There we see that even with the effects of the 
X-variable partialled out, the F-ratio is highly significant, with 
a probability of arising purely by chance alone only somewhere 
between one-half and one-tenth of one per cent. We may conclude that,
barring systematic effects that could bring about this result from 
variables not controlled in this study, the experimental, frequent- 
quizzing method did bring about significantly greater achieve- 
ment on the part of those students to whom it was applied. 
Note that nothing is said in this conclusion about how much, if 
any, of this superiority is due to the greater attendance at dis- 
cussion groups which, we have noted, was apparently motivated 
by the experimental, frequent-quizzing method. This question 
could not be answered until we did the double-classification 
analysis. 
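The arithmetic behind such an F-ratio can be sketched as follows. The sums of squares and products are the figures as best they can be read from Table 3 and should be treated as assumptions of this sketch; the function names are ours. Each sum of squares of Y is first reduced by the part predictable from X, and the adjusted between-groups sum is the difference between the reduced total and the reduced within-groups sums.

```python
def reduced_ss(sum_xx, sum_xy, sum_yy):
    """Sum of squares of Y left after linear regression on X."""
    return sum_yy - sum_xy ** 2 / sum_xx


def covariance_f(total, within, df_between, df_within_reduced):
    """F-ratio for adjusted group means in a single classification."""
    within_reduced = reduced_ss(*within)
    adjusted_between = reduced_ss(*total) - within_reduced
    mean_square_between = adjusted_between / df_between
    mean_square_within = within_reduced / df_within_reduced
    return mean_square_between / mean_square_within


# (sum of x squared, sum of xy, sum of y squared), as read from Table 3.
TOTAL = (154.34, 1711.0, 39495.0)
WITHIN_METHODS = (151.91, 1635.9, 37167.0)

# One degree of freedom between the two methods groups; 182 - 2 - 1 = 179
# within groups after fitting the regression.
f_methods = covariance_f(TOTAL, WITHIN_METHODS, 1, 179)
print(round(f_methods, 2))
```

With these inputs the ratio comes to roughly 8.9 on 1 and 179 degrees of freedom, which is in keeping with a probability between one-half and one-tenth of one per cent.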

Column 13 gives correlation coefficients which measure the 
degree of the relationship between the X and Y variables. In 
the row for ‘Total’ appears the coefficient which is the one we 
usually have in mind when we speak of the correlation between 
X and Y over this sample of one hundred eighty-two cases. 
However, we have noticed that the experimental section had 
both a higher X-mean and a higher Y-mean than did the control 
section—that is, some of this ‘total’ correlation is associated with 
group membership. The coefficient in the row for ‘Within 
Methods Groups’ is a correlation coefficient which measures the 
extent of the relationship between Y and X that is not associated 
with group membership. It is called, appropriately, the ‘within- 
groups correlation,’ as opposed to that portion of the total rela- 
tionship which is associated with ‘between-groups’ differences. 
That it is slightly smaller than the total correlation coefficient 
indicates, as we have already noted, that some small portion 





TABLE 3. ANALYSES FOR SINGLE CLASSIFICATIONS: METHODS AND DISCUSSION ATTENDANCE

(1)                       (2)      (3)      (4)      (5)     (6)      (7)      (8)     (9)     (10)    (11)   (12)              (13)
Source                    df       Σx²      Σxy      Σy²     Reduced  df for   Adj.    df for  Mean    F      Probability       r
                          before                             Σy²      Reduced  Σy²     Adj.    Square
                          Adj.                                        Σy²              Σy²

Method (Quiz Frequency)     1      2.43     75.1     2329                      976     1       976     8.94   .001 < P < .005
Within Methods Groups     180    151.91   1635.9    37167    19551    179                      109                              .688
Total                     181    154.34   1711.0    39495    20527    180                                                       .693

Discussion Attendance       3      5.27    121.0     3186                     1176     3       392     3.59   .01 < P < .025
Within Discussion
  Attendance Groups       178    149.07   1590.0    36309    19351    177                      109                              .683
Total                     181    154.34   1711.0    39495    20527    180                                                       .693











of the total relationship is associated with between-groups 
differences. 
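The two coefficients can be recomputed from the corresponding sums of squares and products. The figures below are those as best read from Table 3 for the methods analysis and are assumptions of this sketch; the function name is ours.

```python
from math import sqrt


def correlation(sum_xx, sum_xy, sum_yy):
    """Product-moment correlation from sums of squares and products."""
    return sum_xy / sqrt(sum_xx * sum_yy)


# (sum of x squared, sum of xy, sum of y squared), as read from Table 3.
r_total = correlation(154.34, 1711.0, 39495.0)
r_within_methods = correlation(151.91, 1635.9, 37167.0)
print(round(r_total, 3), round(r_within_methods, 3))
```

The within-groups value falls only a few thousandths below the total value, so very little of the relationship between X and Y is carried by the difference between the two methods groups.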

The second portion of Table 3 gives the analysis for discussion 
attendance groupings as a single classification. We have seen 
in Table 2 that there is considerable variation in X-means for 
the discussion attendance groupings. The question arises again: 
To what extent is the variation in achievement among the several 
discussion attendance groupings, as measured by the criterion 
variable Y, not due to any predetermination effects of the student 
characteristics measured by X? The answer is contained in 
columns 11 and 12 where we see that, even after partialling out 
the effects of X, the variation in achievement among the discus- 
sion attendance groupings is moderately large, though not quite 
significant at the one per cent level, the probability for the 
obtained F having arisen by chance alone being somewhere 
between two and one-half and one per cent. We may conclude 
as follows: there was a tendency for students who differed on 
characteristics measured by X, which, as we have said, presum- 
ably include interest and/or ability in this course, to differ in their 
frequency of attendance at the voluntary discussion groups; when 
we have removed the effects of this tendency, it appears that 
frequency of attendance at discussion groups is still related to 
achievement in the course, though the relationship is only 
moderate, failing slightly to meet the one per cent level of 
significance. 
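The adjustment described above, in which the effects of X are partialled out before the groups are compared on Y, can be illustrated with the textbook single-classification analysis of covariance. The sketch below is only an illustration under stated assumptions: the two groups and their scores are invented, the function names are hypothetical, and the formulas are the standard reduced sums of squares rather than a transcription of the authors' worksheet.

```python
def sums_about_means(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squares and products about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def ancova_f(groups):
    """Single-classification analysis of covariance on Y, adjusting for X.

    groups: list of (x_values, y_values) pairs, one pair per group.
    Returns (F, df_between, df_within) for the adjusted group effect.
    """
    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    n, k = len(all_x), len(groups)

    # Total sums of squares and products, ignoring group membership.
    txx, tyy, txy = sums_about_means(all_x, all_y)

    # Within-groups sums, pooled over groups (each about its own means).
    parts = [sums_about_means(xs, ys) for xs, ys in groups]
    wxx = sum(p[0] for p in parts)
    wyy = sum(p[1] for p in parts)
    wxy = sum(p[2] for p in parts)

    # Reduced (adjusted) sums of squares of Y after removing regression on X.
    total_adj = tyy - txy ** 2 / txx        # n - 2 degrees of freedom
    within_adj = wyy - wxy ** 2 / wxx       # n - k - 1 degrees of freedom
    between_adj = total_adj - within_adj    # k - 1 degrees of freedom

    df_between, df_within = k - 1, n - k - 1
    f_ratio = (between_adj / df_between) / (within_adj / df_within)
    return f_ratio, df_between, df_within


# Invented scores: the second group is higher on both X and Y.
control = ([1, 2, 3, 4], [2, 3, 3, 5])
experimental = ([3, 4, 5, 6], [5, 6, 6, 8])

f_ratio, df_between, df_within = ancova_f([control, experimental])
print(f"F = {f_ratio:.2f} with {df_between} and {df_within} degrees of freedom")
```

The obtained F would then be referred to a table of the F distribution with the stated degrees of freedom, as is done in columns 11 and 12.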

Column 13 is interpreted as indicated above. Note that the 
within-groups correlation falls farther below the total correlation 
than was the case in the methods analysis. This implies that 
more of the total relationship is associated with variation among 
discussion attendance groupings, than was the case for method- 
group differences. 
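The difference between a total correlation and a within-groups correlation can be seen in a small numerical sketch. The data below are invented for illustration, and the function names are hypothetical. The within-groups coefficient is formed from sums of squares and products taken about each group's own means and then pooled, so that differences between group means do not enter into it.

```python
from math import sqrt


def sums_about_means(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squares and products about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def total_correlation(groups):
    """Correlation over all cases, ignoring group membership."""
    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    sxx, syy, sxy = sums_about_means(all_x, all_y)
    return sxy / sqrt(sxx * syy)


def within_groups_correlation(groups):
    """Correlation from sums pooled within groups."""
    parts = [sums_about_means(xs, ys) for xs, ys in groups]
    sxx = sum(p[0] for p in parts)
    syy = sum(p[1] for p in parts)
    sxy = sum(p[2] for p in parts)
    return sxy / sqrt(sxx * syy)


# Invented scores: the second group has a higher X-mean and a higher Y-mean.
control = ([1, 2, 3, 4], [2, 3, 3, 5])
experimental = ([3, 4, 5, 6], [5, 6, 6, 8])
groups = [control, experimental]

r_total = total_correlation(groups)
r_within = within_groups_correlation(groups)
print(f"total r = {r_total:.3f}, within-groups r = {r_within:.3f}")
```

In this invented example the within-groups coefficient is the smaller of the two, because part of the total relationship is carried by the difference between the group means.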

Let us now summarize the conclusions we have been able to 
draw so far from our experimental data. Even when the pre- 
determination effects of the student characteristics measured by 
X are partialled out, we have seen that there is still a highly 
significant difference in achievement in the course between the 
experimental and the control sections, and we have also seen that 
attendance at the discussion groups bears a relationship to 
achievement in the course. Now, however, we have observed 
in Tables 1 and 2 that discussion attendance behavior differed
markedly between the two sections—the ‘attenders’ in the experi- 
mental, frequently-quizzed section were motivated to attend 
many more discussion groups, on the average, than were the 
attenders in the control section. The following question now 
arises: how much of the significant superiority in achievement 
of the experimental section over the control section can be 
explained by the greater attendance at discussion groups which 
was motivated in the experimental section? Can all of it be so 
explained? If not, how large (how significant) is that part which 
cannot be so explained? The double classification analysis to 
follow undertook to answer these questions. 

The Double Classification Analysis.*—The double classification 
analysis is presented in full in Table 4. Here the tests of the 
difference between the two methods groups are tests taking into 
account the variation in discussion attendance. Similarly, the 
tests of variation among discussion attendance groups are tests 
taking into account method-group membership. 

Columns 11 and 12 give the significance tests on variation in 
achievement after partialling out X. We see that the F-ratio 
for method has been drastically reduced from that in the single 
classification. This reduction is due to the fact that attendance 
at discussion groups is now being taken into account in the test 
of differences between the methods groups; the methods groups 
differed in their discussion attendance behavior, and discussion 
attendance was related to achievement in the course (Table 3). 
Similarly we see that the F-ratio for variation among discussion 
attendance groups is reduced over that in Table 3. 

The result of greatest interest to us, however, is that, even
when we take differing discussion group attendance into account,
and also partial out X, there still remains a superiority of the
experimental group over the control group of considerable
magnitude, the obtained F-ratio having a probability of arising by
chance only somewhere between two and one per cent. That is
to say, when we remove the predetermination effects measured
by X, and when we remove the effect of disproportionate
distribution of discussion attendance, there still remains a superiority
in achievement of the experimental section over the control
section that barely falls short of being significant at the one per cent
level.

* We shall not go into a detailed explanation of a new factor introduced
in the double classification design, the so-called ‘Interaction’ effect between
method and discussion attendance, because it plays no important role in
our results. It must be included, however, in order to make the analysis
complete. In fact, in the analysis of Table 4 the Interaction effect was
clearly insignificant, and was combined with ‘within-cells’ to yield a
combined ‘residual’ error estimate to be used in testing the main effects of
method and attendance at discussions. We are therefore dealing only with
the two main effects in Table 4.

TABLE 4. COMPLETE ANALYSIS: DOUBLE CLASSIFICATION, METHOD x DISCUSSION ATTENDANCE. NON-ORTHOGONAL CASE, ASSUMPTION B
Sources of variation: Method (Quiz Frequency); Discussion Attendance;
Interaction; Within Cells; Total; Interaction + Within Cells = Residual.

We are now in a position to answer the question which we posed 
ourselves at the end of the preceding section. It is clear, first, 
that frequent quizzing did bring about greater achievement in 
this course. But we can now say, further, that while some of 
this greater achievement can be explained by the greater attend- 
ance at discussion groups (on the part of attenders) which the 
frequent quizzing apparently motivated, by no means all of it 
can be so explained; there remains a superiority which cannot 
be so explained that falls between the two and the one per cent 
levels of significance. 

Indeed, we can do more; we can trace the principal source of
this remainder. In Table 5, in the column headed WD, we see
that by far the largest contribution to ΣWD comes from the
non-attenders, and the next largest contribution comes from the
low attenders in the 1-2 group. This means that the major
portion of the methods difference which remains significant in the
double classification of Table 4 is due to these two groups. This
may be interpreted as follows: the high attenders in the
experimental section had high achievement, but they were expected
to have high achievement; their counterparts in the control
section also had high achievement, so that the high attenders
contribute little to the methods difference in the double
classification of Table 4 (although they contributed a great deal to the
methods difference in the single classification of Table 3, where
discussion attendance was not taken into account, because there
were a great many more of them in the experimental section than
there were in the control section); on the other hand, low- and
non-attenders in the experimental section apparently achieved a
great deal more than their counterparts in the control section,
and it is these groups that account for the significant methods
difference that remains in Table 4. We are thus led to speculate
that the frequent quizzing method, besides motivating greater
attendance at discussion groups on the part of attenders, also
motivated greater outside endeavor with respect to the course,
which we have no measure of in this study; moreover this greater
outside endeavor shows up dramatically in the case of the low-
and non-discussion group attenders, which accounts for the
significant methods difference still remaining when discussion
attendance is controlled in Table 4.

TABLE 5. DOUBLE CLASSIFICATION, METHOD x DISCUSSION
ATTENDANCE: EXTRACTED PORTION FROM TABLE FOR
ESTIMATION OF SUMS OF SQUARES AND PRODUCTS

Discussion
Attendance        W          D          WD
7-14            10.88      +1.96      +21.4
3-6              8.06      +3.47      +27.9
1-2              6.97      +9.93      +69.2
None            16.5       +7.39     +122.0
Total           42.41                +240.5
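The arithmetic behind this reading of Table 5 can be retraced directly from the printed figures. The following sketch simply multiplies each weight by its difference and expresses each product as a share of the total; the variable names are illustrative, and because the printed W and D values are rounded, the recomputed total may differ slightly in the last digit from the printed one.

```python
# Weights (W) and differences (D) as printed in Table 5,
# keyed by discussion attendance group.
table5 = {
    "7-14": (10.88, 1.96),
    "3-6": (8.06, 3.47),
    "1-2": (6.97, 9.93),
    "None": (16.5, 7.39),
}

# Product W * D for each attendance group, and their sum.
products = {group: w * d for group, (w, d) in table5.items()}
total_w = sum(w for w, _ in table5.values())
total_wd = sum(products.values())

# Share of the summed products contributed by each group.
shares = {group: p / total_wd for group, p in products.items()}

for group in table5:
    print(f"{group:>5}  WD = {products[group]:6.1f}  share = {shares[group]:.0%}")
print(f"Total W = {total_w:.2f}, total WD = {total_wd:.1f}")
```

On these figures the non-attenders and the 1-2 group together supply roughly four-fifths of the summed products, which is the basis for attributing the remaining methods difference chiefly to them.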

It is of considerable interest, from the point of view of statis- 
tical methodology, to note that the question which we posed 
ourselves at the end of the last section, and have now answered, 
could not have been answered without resorting to the non- 
orthogonal case, Assumption B. Even those research workers 
who occasionally use the non-orthogonal case tend to look upon 
it as an inconvenient nuisance, made necessary by bad luck, or 
bad planning, or both, that would have been better avoided. 
This may frequently be true, especially when Assumption A is 
the appropriate one (that unequal cell frequencies are merely 
chance sampling effects). But here we had a situation where the 
disproportionate frequencies are clearly characteristic of the 
classificatory factors under study, and proceeding under the 
non-orthogonal case, Assumption B, we have been able to answer 
a question about our experimental results which could not other- 
wise have been answered. 

Since the test of discussion attendance variation in Table 4 is 
not of major interest in drawing conclusions about our experi- 
ment, we shall not discuss it in detail, but shall only note that, 
when group membership is controlled, as well as X partialled out,
we see in Column 12 that the variation among discussion attend- 
ance groups drops below the five per cent level of significance. 
Column 13 may be interpreted as previously indicated. 


SUMMARY AND CONCLUSIONS 


A non-orthogonal case (disproportionate frequencies) of analy- 
sis of covariance is demonstrated in the investigation of an 
independent educational variable in relation to another variable 
which is differentially distributed in the experimental and control 
groups. 

Frequently-quizzed students in Government have significantly 
higher achievement than students receiving only monthly quizzes, 
even after a predetermining index of ability in Government has 
been partialled out. 

This superiority is accounted for only in part by voluntary 
attendance at discussion groups in Government conducted in 
addition to class lectures. Even when discussion attendance 
as well as predeterminers of ability are taken into account, the 
superiority of the frequently-quizzed students is a significant 
one. Much of this superiority is attributed to the greater 
achievement of frequently-quizzed students who attended no or 
few discussion groups over the achievement of non-frequently- 
quizzed students who attended no or few discussion groups. 

Additional evidence suggests that frequent quizzing motivates 
some students to attend more discussion groups. 

On the basis of these findings we may conclude that frequent 
testing of achievement in the college lecture classroom may moti- 
vate such outside endeavor as will result in superior achievement. 
The instructor who wishes to use such motivational learning 
devices to accompany lectures would do well to make available 
instructional supplements such as the extra discussion groups of 
this study or other instructional materials and experiences closely 
correlated with course content. 


BIBLIOGRAPHY 


1) W. F. Book and L. Norvell. “The will to learn,” Pedagogical Seminary,
1922, 29, 305-362.

2) E. Greenwood. Experimental Sociology, New York: Kings Crown
Press, 1945.

3) R. W. B. Jackson. Application of the Analysis of Variance and
Covariance Method to Educational Problems, Bulletin 11, Dept. of Educ.
Research, University of Toronto, 1940.

4) N. Keys. “The influence on learning and retention of weekly as
opposed to monthly tests,” J. Educ. Psychol., 1934, 25, 511-20.

5) E. F. Lindquist. Statistical Analysis in Educational Research, Boston:
Houghton-Mifflin, 1940.

6) Q. McNemar. Psychological Statistics, New York: John Wiley & Sons,
1949.

7) Maxine Merrington and Catherine M. Thompson. “Tables of
percentage points of the inverted Beta (F) distribution,” Biometrika, 1943, 33,
75-88.

8) L. Panlasigui and F. B. Knight. “The effect of awareness of success
or failure,” Twenty-Ninth Yearbook of the National Society for the Study of
Education, Part II, 1930, 611-619.

9) G. W. Snedecor. Statistical Methods. 4th Ed. Ames, Iowa: Iowa
State College Press, 1946.

10) Catherine M. Thompson and Maxine Merrington. “Tables for
testing the homogeneity of a set of estimated variances,” Biometrika,
1946, 33, 296-304.

11) F. Tsao. “General solution of the analysis of variance and
covariance in the case of unequal or disproportionate numbers of observations in
the subclasses,” Psychometrika, 1946, 11, 107-128.

12) A. H. Turney. “Effect of frequent short objective tests upon the
achievement of college students in Educational Psychology,” Sch. & Soc.,
1931, 33, 760-762.

13) Encyclopedia of Educational Research. The American Educational
Research Association. New York: The Macmillan Co., 1941, p. 243.








A VALIDITY STUDY OF PERSONALITY QUESTIONNAIRES
AT THE UPPER ELEMENTARY GRADE LEVEL*


WILLIAM J. LODGE
Professor of Education, Chico State College


THE PROBLEM 


This study appraises the validity of a direct and an indirect 
form of a personality and attitude questionnaire at the fifth-, 
sixth-, and seventh-grade levels, as measured by correlations 
with external criteria and by an analysis of responses to the 
individual items. The study also offers evidence bearing on the 
effects on questionnaire responses at these grade levels of the 
requirement or non-requirement of signatures. 

The experiment was designed to evaluate and extend a study by 
Ellis¹ in which two forms of direct and two forms of ‘equivalent’
indirect questions were combined into one measuring instrument 
in an effort to determine which form could best distinguish 
between groups of ‘normal’ and ‘problem’ seventh- and eighth- 
grade boys. None of the four forms did so distinguish, but the 
indirect type used in the present investigation was regarded by 
Ellis as the most promising. Examples of the two types employed 
in the present study are as follows: 

Direct: I cry 
Very often Pretty often Seldom Never 
Indirect: Children who often cry are 
Very queer Pretty queer A little queer Not at all queer 

Ellis’¹ theory, (4) which is not confirmed by the present
experiment, is that children who themselves possess the characteristic
of frequent crying will classify other children who often cry as
‘Not at all queer,’ and that, in effect, the child, while ostensibly
rating other boys and girls, will actually be revealing his own
characteristics.

* Summary of a Ph.D. thesis on file in the University of California Library,
Berkeley, entitled A Study of Some Factors Affecting Responses on Personality
Questionnaires.


EXPERIMENTAL PROCEDURE


The questionnaire employed by the writer consists of forty-three
items, thirty-one of which were included in the total of
thirty-six used by Ellis.¹* They were obtained by him from a
frequency count of the items employed in eight popular personality
inventories. Five Hartshorne and May³ ‘ringer’ items† and
seven cheating attitude questions based on a review of the
literature were added by the writer. The foregoing forty-three items
were drawn in random order and were then written in direct and
indirect form respectively, in the order drawn by chance.
Split-half Pearson r’s based on one hundred and one cases for the direct
form and one hundred and five for the indirect were .83, P.E. =
.02 and .86, P.E. = .02, respectively.
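The reported probable errors can be checked against the classical large-sample formula for the probable error of a correlation coefficient, P.E. = 0.6745(1 - r²)/√N. The sketch below assumes that this is the formula behind the printed values; the article does not say which formula was used, and the function name is hypothetical.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Classical probable error of a Pearson r based on n cases."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# Split-half coefficients and numbers of cases as reported in the text.
pe_direct = probable_error_of_r(0.83, 101)
pe_indirect = probable_error_of_r(0.86, 105)

print(f"direct form:   P.E. = {pe_direct:.4f}")
print(f"indirect form: P.E. = {pe_indirect:.4f}")
```

Under that assumption both values round to .02, in agreement with the figures given.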

The comparison between the two forms of the questionnaire is 
based on the administration of the instrument by the writer to 
two groups of two hundred forty-four children each, matched for 
grade, sex, IQ, reading comprehension and socio-economic status. 
The subjects were drawn from five California public schools. 
One-half of the children in each group were tested with signatures 
required and the other half anonymously, with identification 
subsequently accomplished by the experimenter by means of a 
pinprick code system. The questionnaires were administered 
a second time one week later by the classroom teachers. 

The ‘personality’ items on the questionnaire were scored in 
accordance with the system developed empirically by Ellis¹; i.e.,
one point was scored on the direct form for each ‘Very often’ or
‘Pretty often’ answer and no points for each ‘Seldom’ or ‘Never’
response. The indirect form was scored one point for each ‘Not
at all queer’ reply and no points for the ‘Very queer,’ ‘Pretty 
queer,’ and ‘A little queer’ answers. The cheating attitude 
questions were scored in the same way except that ‘Seldom’ 
answers on the direct form were also scored one point, since they 
obviously constitute a cheating admission. 
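The scoring rules just described can be written out as a short sketch. The answer labels are those quoted in the text; the function names, the form labels "direct" and "indirect", and the way cheating items are flagged are illustrative assumptions, not part of the original instrument.

```python
# Answers that earn one point, as described in the text.
DIRECT_KEYED = {"Very often", "Pretty often"}
INDIRECT_KEYED = {"Not at all queer"}


def score_item(answer, form, is_cheating_item=False):
    """Return 1 if the answer counts as an unfavorable response, else 0."""
    if form == "direct":
        keyed = set(DIRECT_KEYED)
        if is_cheating_item:
            # On direct cheating items a 'Seldom' answer is also an admission.
            keyed.add("Seldom")
        return int(answer in keyed)
    if form == "indirect":
        return int(answer in INDIRECT_KEYED)
    raise ValueError(f"unknown form: {form!r}")


def score_form(answers, form, cheating_items=frozenset()):
    """Sum item scores; answers maps an item number to the chosen answer."""
    return sum(
        score_item(answer, form, item in cheating_items)
        for item, answer in answers.items()
    )


# Hypothetical answer sheet: item 3 is treated as a cheating attitude question.
sheet = {1: "Very often", 2: "Never", 3: "Seldom"}
direct_total = score_form(sheet, "direct", cheating_items={3})
print(direct_total)
```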

A social status score for each child was determined by the
administration on the fourth day of a sociometric instrument by
the classroom teachers. This instrument employed the three
criteria used by Flotow,² involving up to three choices of children
with whom the child would like to work and play and next to
whom he would like to sit. Flotow reports, in non-quantitative
terms, a high degree of validity and reliability. The directions
to the children followed closely the pattern recommended by
Moreno⁴ to obtain maximum validity.

* By permission of the American Psychological Association, Inc., and
Psychological Monographs.
† By permission of the Macmillan Company.

Each child’s social status score as employed in this study was 
determined by calculating the total number of times he was 
chosen by others, with specific criterion and one-two-three order 
of choice disregarded. 
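The tally described here amounts to counting how often each child is named by classmates. A minimal sketch follows, assuming a hypothetical record layout in which each child's choices are listed under the three criteria; the names and the layout are invented for illustration.

```python
from collections import Counter


def social_status_scores(choices):
    """Count how many times each child is chosen by others.

    choices: {chooser: {criterion: [chosen children, in order of choice]}}
    The criterion and the order of choice are disregarded.
    """
    scores = Counter()
    for chooser, by_criterion in choices.items():
        scores[chooser] += 0  # make sure unchosen children appear with zero
        for chosen_children in by_criterion.values():
            for chosen in chosen_children:
                if chosen != chooser:  # a self-choice would not be "by others"
                    scores[chosen] += 1
    return scores


# Invented class of three children.
choices = {
    "Ann": {"work": ["Bob", "Cal"], "play": ["Bob"], "sit": ["Cal"]},
    "Bob": {"work": ["Ann"], "play": ["Cal"], "sit": ["Ann"]},
    "Cal": {"work": ["Bob"], "play": ["Bob"], "sit": ["Bob"]},
}
scores = social_status_scores(choices)
print(dict(scores))
```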

The Van Wagenen Unit Scales of Achievement: Reading Comprehension
Tests⁶ were administered on the twenty-first day for
the dual purpose of obtaining standardized reading comprehen- 
sion scores for matching and correlational purposes and to con- 
trol a classroom cheating experimental situation. The papers 
were scored by the experimenter with no marks recorded on them 
and then returned promptly to the children, ostensibly for self- 
scoring. A cheating behavior rating was subsequently deter- 
mined for each child by dividing the difference between his true 
score and a perfect score into the number of points he raised his 
score, if any. 
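Read literally, the rating is the number of points a child added to his score divided by the number of points he could have added. The sketch below expresses that ratio; the function and argument names are illustrative, and the treatment of a paper that was already perfect (rated zero, since there was nothing to add) is an assumption not stated in the text.

```python
def cheating_rating(true_score, reported_score, perfect_score):
    """Points raised during self-scoring, as a fraction of the possible raise."""
    room = perfect_score - true_score
    if room <= 0:
        # Assumption: a perfect paper offers no opportunity to raise the score.
        return 0.0
    raised = max(reported_score - true_score, 0)
    return raised / room


# Hypothetical child: true score 30 of 50, reported 35 after self-scoring.
rating = cheating_rating(true_score=30, reported_score=35, perfect_score=50)
print(rating)
```

A rating of zero thus indicates no raising of the score, and a rating of one indicates that the child claimed a perfect paper.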

Finally, socio-economic ratings were determined by converting
parent’s occupation in accordance with the Barr Scale Ratings of
Occupational Status⁵ (66-72).


STATISTICAL FINDINGS 


Table I shows that the direct and the indirect forms of the 
questionnaire are significantly different according to a rigid chi- 
square test which was used because of the skewed distribution of 
the data. Correlations of various sections of the questionnaire, 
however, with external criteria (Table III) suggest that neither 
form possesses much validity as measured. The r’s in Table III 
indicate some superiority for the indirect form, but positive r’s 
between personality problems as scored by the indirect form and 
intelligence, reading comprehension, and socio-economic status 
(Table IV) are interpreted as meaning that, if the theory held by 
Ellis¹ and by others is tenable, superior children tend to have a
greater number of personality problems than do those below the 
mean. It might be maintained that the more intelligent children 
are more self-critical, which would be reflected in higher ‘neurotic’ 
scores, but neither the literature generally nor the item analysis 
(Tables V, VI) particularly supports such an hypothesis. 

TABLE I. COMPARISON OF THE DIRECT FORM OF PERSONALITY
QUESTIONNAIRE WITH THE INDIRECT FORM: UNFAVORABLE
RESPONSES

                                      Direct          Indirect      Chi Square            Significance
Category                             N    Mean       N    Mean     of Difference    P     Level
Personality Score, 1st Testing*     244    6.20     244   11.93        64.24       .01    1 per cent
Personality Score, 2nd Testing*     244    5.29     244   12.64        65.01       .01    1 per cent
Cheating Attitudes, 1st Testing     244    1.79     244     .95         6.74       .05    5 per cent
Cheating Attitudes, 2nd Testing     244    1.90     244    1.05         8.00       .02    5 per cent

* Includes the following sub-categories, each of which was significantly
different at either the one or five per cent level: fears, school adjustment,
sensitivity and excitability, psychosomatic symptoms, and social adjustment.


TABLE II. COMPARISON OF THE ANONYMOUS GROUP WITH THE
SIGNATURES-REQUIRED GROUP: UNFAVORABLE RESPONSES
TO THE PERSONALITY QUESTIONNAIRE, FIRST TESTING

                                                     Signatures
                                     Anonymous        Required      Chi Square            Significance
Category                             N    Mean       N    Mean     of Difference    P     Level
Personality Score*                  244    8.72     244    9.42         3.62       .20    NS
Cheating Attitudes                  244    1.14     244    1.59         2.29       .30    NS

* Includes the following sub-categories: fears, school adjustment,
sensitivity and excitability, psychosomatic symptoms, and social adjustment.


TABLE III. COMPARISON OF SECTIONS OF THE INDIRECT AND
DIRECT FORMS OF THE PERSONALITY QUESTIONNAIRE
THROUGH CORRELATIONS WITH EXTERNAL CRITERIA:
FIRST TESTING

                                                                  Tetrachoric r      Chi Square            Significance
Questionnaire Section           External Criterion         N     Direct  Indirect   of Difference    P     Level
Social Adjustment Problems      Sociometric Rating        488     +.20     -.21         39.52       .01    1 per cent
Cheating Attitude Admissions    Cheating Behavior         403     +.08     +.25          1.03       .90    NS
School Adjustment Problems      Intelligence Quotients    488     +.17     -.19         49.07       .01    1 per cent


TABLE IV. CORRELATIONS OF TOTAL PERSONALITY SCORES* OF
THE DIRECT AND INDIRECT FORMS OF THE QUESTIONNAIRE,
RESPECTIVELY, WITH EXTERNAL CRITERIA: FIRST TESTING

                                      Tetrachoric r      Chi Square            Significance
External Criterion             N     Direct  Indirect   of Difference    P     Level
Intelligence Quotients        488     -.15     +.31         69.05       .01    1 per cent
Socio-Economic Status         488     -.13     +.06         69.77       .01    1 per cent
Reading Comprehension         488     -.04     +.28         53.07       .01    1 per cent

* ‘Total personality score’ includes fears, school adjustment, sensitivity
and excitability, psychosomatic symptoms, and social adjustment.


TABLE V. NUMBER OF UNFAVORABLE RESPONSES TO THE PERSONALITY
AND CHEATING ATTITUDE ITEMS OF THE QUESTIONNAIRE:
FIRST TESTING*

                                                                            Direct    Indirect
Nature of Item                                                              N = 244   N = 244
Finds it hard to speak out in class                                            85       140
Would rather play by self than with others                                     18        40
Often becomes gloomy or sad                                                    49        92
Is afraid of high places                                                       58       126
Often has headaches                                                            50       148
Dislikes school                                                                48        76
Cheats to get good marks and please parents                                    69        32
Often has horrible dreams while sleeping                                       31       113
Cheats to get good marks and keep up with friends or brothers or sisters       48        33
Cheats if knows that other children are cheating                               62        32
Feels that teachers treat him badly or unfairly                                14        61
Feels that people often find fault with him                                    55        53
Is afraid of being followed when walking on the street                         43        81
Is afraid of fire                                                              80       148
Often feels ill or weak                                                        26       135
Often day-dreams                                                               87        59
Cheats if believes will not be caught                                          51        20
Is often bothered with pains in some part of his body                          45       147
Feels that nobody really loves him                                             17        34
Cheats on tests in school if the tests are hard and unfair                     84        34
Cheats to get good marks and please the teachers                               45        25
Feelings are easily hurt                                                      100        61
Finds it hard to become a leader                                               77       107
Is afraid of the dark                                                          33        69
Often feels tired when he wakes up in the morning                             105       116
Stutters or stammers                                                           14       103
Has habit of twitching his neck or face                                        13        46
Bites his nails                                                                85        53
Finds it hard to get along in school                                           25        57
Cheats if thinks the teacher is unfair                                         72        30
Often becomes excited                                                         125       120
Finds it hard to make friends                                                  20        81
Other children refuse to play with him                                         17        70
Often cries                                                                    29        67
Finds it hard to do his school work                                            47       110
Is afraid during bad storms                                                    24        96
Often feels tired during the daytime                                           33       106

* ‘Ringer’ items are omitted.


The data suggest that many children do not, in effect, rate
themselves when they respond to most of the indirect items but do
rate other children, as instructed, and that the superior children
in many cases do so rather tolerantly. One finds it difficult to
believe, for example, that out of two hundred and forty-four
children responding to the indirect form, approximately one-half
or more (1) find it difficult to speak out in class or do school work
right, (2) often have headaches, pains in some part of the body, or
feel ill or weak, (3) often have horrible dreams while sleeping,
wake up tired in the morning, and remain so all day, (4) have
their feelings easily hurt, and (5) are afraid of high places and of
fire.

One may also doubt that in a group of two hundred and forty- 
four ‘normal’ fifth-, sixth-, and seventh-grade children (1) more 
than one-third are afraid of being followed on the street, (2) more 
than one-third stutter or stammer, (3) other children refuse to 
play with almost one-third of the group, (4) more than one-fifth 
bite their nails, and (5) almost one-fifth habitually twitch their 
necks or faces. 
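The fractions cited in this paragraph can be retraced from the indirect-form counts in Table V. The sketch below merely divides each printed count by the group size of 244; the shortened item labels and variable names are illustrative.

```python
N = 244  # children responding to the indirect form

# Indirect-form counts of unfavorable responses, as printed in Table V.
indirect_counts = {
    "afraid of being followed on the street": 81,
    "stutters or stammers": 103,
    "other children refuse to play with him": 70,
    "bites his nails": 53,
    "twitches his neck or face": 46,
}

proportions = {item: count / N for item, count in indirect_counts.items()}
for item, proportion in proportions.items():
    print(f"{item}: {proportion:.3f}")
```

On the printed counts the first proportion comes out at very nearly one-third, and the others agree with the approximate fractions given above.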

Table V indicates that the approximate two-to-one ratio 
between number of alleged personality problem ‘admissions’ 
on the indirect and direct forms, respectively, was reversed in the 
case of cheating attitude questions. The reasons for this rever- 
sal are not clear from the available data, but it is hypothesized 
that many of the children in the indirect group did make the 
identification and transfer when confronted with threatening 
items, and, realizing the implications involved in the cheating 
questions, cautiously made fewer admissions than did the direct 
group. It seems probable that the indirect method will yield some 
‘neurotic’ admissions that the direct method, with its bludgeon- 
like attack, would miss. But the indirect approach employed in
this investigation will score a great many ‘normal’ responses as 
‘neurotic’ or ‘unfavorable.’ In more concrete terms, a child, for 
example, who replies that children who are afraid of high places 
are ‘Not at all queer’ is not necessarily, or even probably, admit- 
ting that he himself is afraid of high places. 


TABLE VI. NUMBER OF UNFAVORABLE RESPONSES TO THE
‘RINGER’ ITEMS OF THE PERSONALITY QUESTIONNAIRE:
FIRST TESTING

                                                                      Direct    Indirect
Nature of Item                                                        N = 244   N = 244
Keeps other children quiet when the teacher is out of the room           99       192
Picks up broken glass lying in the street                                72       156
Smiles when things go wrong                                              92       112
Reports to the police the license numbers of speeding automobiles        27       162
Reports to the teacher other children whom he sees cheating              33        69


Subject to further research involving the effects on the results 
of an examiner who is a stranger to the children and/or the 
influences of the individual classroom teachers, Table II points to 
the conclusion that anonymity of responses is not a major factor 
in personality and attitude testing at the grade levels here con- 
sidered. In fact, differences, while not significant by chi-square 
test, indicate a greater number of unfavorable admissions for the 
signatures-required group for both the personality and cheating 
attitude items. 

SUMMARY AND CONCLUSIONS 


This study evaluates certain aspects of the validity of a direct 
and of a parallel indirect form of personality questionnaire, most 
of the items for which originated with eight popular commercial 
personality questionnaires. The direct form openly asks the child 
about himself, while the indirect form requests his opinion of other 
children. The theory, advanced by several experimenters but not
confirmed in this study, is that children, although ostensibly
rating others, will actually reveal their own characteristics. 
Finally, additional evidence is offered regarding the controversial
issue of the effects on responses to personality questionnaires of 
the requirement or non-requirement of signatures. 

Four hundred and eighty-eight fifth-, sixth-, and seventh-grade 
California public school children were matched in groups of two 
hundred and forty-four each in order to compare the direct form 
of the questionnaire with the indirect and the requirement or 
nonrequirement of signatures. Questionnaire scores for each of 
the matched groups were compared directly and were correlated 
with sociometric ratings, intelligence, reading comprehension, 
socio-economic status, and cheating behavior as measured 
experimentally. 

Correlations of the social adjustment, school adjustment, and 
cheating attitude sections of both the direct and the indirect forms 
of the questionnaire with sociometric ratings, intelligence and 
cheating behavior, respectively, and of total questionnaire scores 
with intelligence, reading comprehension, and socio-economic sta- 
tus indicate (by these criteria) virtually zero validity for the direct 
form and a very low degree of validity for the indirect. These 
correlations, when considered with the item analysis and the 
significantly different mean scores for the two forms strongly sug- 
gest—if they do not conclusively prove—that a number of chil- 
dren sufficient to invalidate the indirect form of the instrument 
respond as instructed with their opinion of others and generally 
do not make the personal identification suggested by several 
previous studies. The blunt, direct form of the questionnaire 
is regarded as invalid and unpromising and the indirect form as 
requiring considerable refinement before it can be applied uncritically.
Many individual questionnaires which might be useful to a
skilled clinician were undoubtedly to be found among both the 
direct and the indirect groups in this investigation. But the 
writer joins many previous investigators in cautioning against 
the making of unwarranted assumptions regarding the validity 
of this type of instrument for survey purposes. 

The mean number of ‘neurotic’ or unfavorable responses by the 
signatures-required group exceeded that of the anonymous group, 
but differences were not significant. Subject to further research 
involving the effects on responses of an experimenter who is a 
stranger to the children, and of the classroom teachers concerned, 
it is concluded that anonymity of response is a minor factor in 
personality questionnaire testing at these grade levels. 







BIBLIOGRAPHY 


1) A. Ellis. A Comparison of the Use of Direct and Indirect
Phrasing in Personality Questionnaires, Psychological Monographs
Number 284. Washington, D. C.: The American Psychological
Association, Inc., 1947.

2) E. A. Flotow. “Charting Social Relationships of School
Children,” The Elementary School Journal, 46:498-504, May, 
1946. 

3) H. Hartshorne and May M. Hartshorne. Studies in 
Deceit. New York: Macmillan, 1938. 

4) J. L. Moreno. Who Shall Survive? Nervous and Mental 
Disease Monograph, Series 58, Washington, D.C.: Nervous and 
Mental Disease Publishing Company, 1934. 

5) L. M. Terman et al. Genetic Studies of Genius, Volume I. 
Palo Alto: Stanford University Press, 1925. 

6) M. J. Van Wagenen. Unit Scales of Attainment in Reading
Comprehension. Minneapolis: Educational Test Bureau, Inc.,
1934.











ACADEMIC ACHIEVEMENTS OF VETERANS AND 
NON-VETERANS AT THE CITY COLLEGE OF 
NEW YORK* 


LOUIS LAURO AND JAMES D. PERRY


Division of Testing and Guidance 
The City College of New York 


INTRODUCTION 


The question of Federal aid to education and its extension to 
higher education is still an unresolved issue. It seems appropri- 
ate, therefore, to gather as much information as possible on the 
most recent large-scale subsidization of education through the
“G. I. Bill.”

The consensus of college professors seems to be that the vet- 
erans are better students than their non-veteran classmates. 
This assumption is supported by published reports, but in many of 
these studies certain essential controls needed to make this 
conclusion a valid one have not been introduced. 

Unless a non-veteran group is introduced, veterans’ scholastic 
performance can be compared only with their pre-service records, 
and observed differences might be attributable to factors other 
than the service experience. Studies which did not introduce a 
non-veteran control group are of limited interpretative value.'® 
If a non-veteran group is to be introduced, the factor of scholastic 
aptitude must also be controlled. This was done by one investi- 
gator through the use of analysis of co-variance techniques,'? by 
others through matched control groups,” while still others merely 
reported mean scholastic aptitude test scores or high-school 
averages for the groups under consideration.*:'6 These investi- 
gators reported veterans to be equal to, or superior to, non- 
veterans. One report, especially noteworthy because it comprised 
several thousand students in colleges throughout the country, 
indicated that differences between veterans and control groups of 
non-veterans were consistently small and in favor of the veterans 
in the majority of cases.¹² In a special program for veterans with
irregular scholastic records, the veterans achieved a higher mean 
college average than did women in regular attendance at the 





* The writers wish to thank Dr. Louis Long for his many constructive sug- 
gestions, and the staff members of the Veterans Counseling Office for their 
assistance in recording the data upon which this study is based. 














college, in spite of the fact that the mean high-school average 
of the women was above that of the veterans. 

The failure of some investigators*”:*!° to control the factor of 
scholastic aptitude leaves as an unknown a variable which might 
affect results. These investigators did find slight differences in 
favor of veterans, but only one® considered the statistical signif- 
icance of the difference and in this case the difference was not 
significant. Similarly, in a study which made reference to 
scholastic aptitude measures for only part of the cases, a very 
slight difference in favor of veteran students was reported, but a 
test of significance was lacking.¹⁴

On the assumption that students’ performances in certain 
college courses are indicative of over-all academic achievements, 
one course might be selected as a focal point for investigation. 
However, failure to substantiate the assumption severely limits 
the interpretation of the results. For this reason, it appears 
unjustified to extend to over-all achievement the finding that 
veterans surpassed non-veterans in Freshman English.'* An 
investigation centering about grades in general chemistry must be 
considered inconclusive since there were only eight subjects in the 
non-veteran group.³

An attempt to demonstrate that age was the only factor making 
for the slight superiority of veterans indicated that non-veterans 
made better grades than did veterans of the same age.¹¹ However,
when one considers the upward trend of grades through the
later college years, it becomes apparent that the failure to take 
grade placement into account biases the results against veterans. 
Furthermore, other investigators found no correlation between 
age and academic achievement of veterans® or slight negative 
correlations for both veteran and non-veteran groups.’ 

There is a good deal of variation in the relative scholastic 
achievement of veterans and non-veterans from college to col- 
lege. No institution reported any very dramatic results one way 
or the other; however, slight differences in favor of the veterans 
seem to be the norm. Age does not appear to be a factor in causing
these differences. Apparently the service experience, per se,
does contribute to them. 


SCOPE OF THE STUDY 


The purposes of the present study are: (1) to investigate the 
average grade-point differences between pre- and post-service 
























periods of veterans attending City College, (2) to compare veterans’
averages with non-veterans’ averages, and (3) to determine
whether there are any differences between veterans and non- 
veterans with respect to choice of course of study (including 
nature of changes in study program). 

The subjects of this investigation are students who were 
enrolled as lower seniors in the fall semester of 1948 in one of the 
following courses of study: science, social science, or technology. 
The records of six hundred twenty-nine male students were ob- 
tained. However, a number of these students were not included 
since they fell into one or more of the following categories: 

1) Students who did not attend City College before entering 
service. They had either entered college for the first time after 
their return from service or had taken a full semester’s work or 
more at another institution. 

2) Students who had attended Evening Session at City College 
before entering the service or who had less than ten pre-service 
credits. 

3) Students who were taking training under Public Law 16 (the 
law covering training of disabled veterans). 

4) Students who were twenty-one years of age or more at time 
of entrance. 

Data relating to the final sample will be found in Table 1. 


TABLE 1.—DESCRIPTION OF FINAL SAMPLE


                 Classification    Original    Number      Final
Course           of Students       Sample      Excluded    Sample

Science          Veterans             66          18         48
Science          Non-Veterans         44           2         42
Social Science   Veterans            104          45         59
Social Science   Non-Veterans         85          13         72
Technology       Veterans            240         120        120
Technology       Non-Veterans         90          17         73
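
The counts in Table 1 can be checked against the totals quoted in the text (six hundred twenty-nine records obtained, four hundred fourteen students retained). The following is a minimal sketch of that check; the dictionary layout and the function name are illustrative and are not part of the study.

```python
# Counts transcribed from Table 1: (course, group) -> (original, excluded, final).
TABLE_1 = {
    ("Science", "Veterans"): (66, 18, 48),
    ("Science", "Non-Veterans"): (44, 2, 42),
    ("Social Science", "Veterans"): (104, 45, 59),
    ("Social Science", "Non-Veterans"): (85, 13, 72),
    ("Technology", "Veterans"): (240, 120, 120),
    ("Technology", "Non-Veterans"): (90, 17, 73),
}


def check_table_1(rows):
    """Verify original - excluded == final per row; return both totals."""
    for key, (original, excluded, final) in rows.items():
        assert original - excluded == final, key
    total_original = sum(original for original, _, _ in rows.values())
    total_final = sum(final for _, _, final in rows.values())
    return total_original, total_final


print(check_table_1(TABLE_1))  # (629, 414)
```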


Most of the veterans entered the College between September, 
1941, and September, 1944, while most of the non-veterans 
entered between February, 1945, and February, 1946. The mean 
age at entrance for the two groups was approximately seventeen 
years and four months, with standard deviations varying from 
















seven to ten months. High-school averages and scores on the 
ACE Psychological Examination were recorded for all those 
students for whom this information was available. In Table 2, the


TABLE 2.—MEAN HIGH-SCHOOL AVERAGE AND MEAN SCORE ON
ACE PSYCHOLOGICAL EXAMINATION FOR VETERANS AND
NON-VETERANS


                  Measure of               Veterans           Non-Veterans
Course            Scholastic Aptitude    N     M      SD     N     M      SD    d/σd    P

Science           ACE                    40   121.8  18.4    39   124.7  21.7    .64   .26
                  High-school Average    38    83.7   4.0    40    85.1   4.5   1.46   .07
Social Science    ACE                    55   116.8  16.8    70   117.0  22.1    .06   .48
                  High-school Average    50    82.8   3.9    68    83.4   4.8    .75   .23
Technology        ACE                    97   119.7  16.7    65   126.0  15.9   2.42   .01
                  High-school Average    94    83.8   3.8    63    84.9   4.1   1.69   .05
































mean high-school average and the mean score on the ACE Psy- 
chological Examination are listed for veterans and non-veterans. 

The differences between the means and the reliabilities of these 
differences are also presented in Table 2. The greatest difference 
between veterans and non-veterans is found in the technology 
group, where significant differences between the mean high-school 
averages and mean scores on the Psychological Examination are 
found in favor of the non-veterans. These differences may be 
due to variations in admission standards. 
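
The text does not state how the reliability of each difference was computed. The sketch below assumes the usual large-sample critical ratio (the difference between two independent means divided by its standard error) and a one-tailed normal probability; under those assumptions it reproduces the Table 2 entries for the technology students' ACE scores. The function names are illustrative.

```python
import math


def critical_ratio(n1, m1, sd1, n2, m2, sd2):
    """Difference between two independent means divided by its standard error."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(m1 - m2) / standard_error


def one_tailed_p(z):
    """Area of the standard normal curve lying beyond z (upper tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Technology students, ACE scores (Table 2): veterans, then non-veterans.
z = critical_ratio(97, 119.7, 16.7, 65, 126.0, 15.9)
print(round(z, 2), round(one_tailed_p(z), 2))  # 2.42 0.01
```

Some other rows agree only to within a unit in the second decimal place (the science high-school averages give about 1.45 against the printed 1.46), which is consistent with rounding of the published means and standard deviations.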


METHODS OF TREATING THE DATA 


The effect of variations in amounts of college work, taken in the 
pre- and post-service periods, upon academic achievement was 
partially controlled by dividing the veterans into two groups. 
Group A consisted of those veterans who had completed eleven 
to thirty-two credits in the pre-service period. Group B con- 
sisted of those who had completed more than thirty-two credits 










in the pre-service period. In order to compare Groups A and B 
in terms of scholastic ability, the mean of the scores on the 
Psychological Examination was computed for each group, accord- 
ing to course of study undertaken. The difference between the 
means of Groups A and B was 0.1 for the science students, 4.1 for 
the social science students, and 1.6 for the technology students. 
None of these differences were found to be statistically significant. 

The pre-service grade-point average and the post-service 
grade-point average for veterans were determined by dividing 
the grade-point* by the number of credits taken during the appro- 
priate period. 
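
The calculation can be sketched as follows, using the weights given in the footnote on the grade-point (A = +2, B = +1, C = 0, D = -1, F = -2 per credit). The function name and the sample record are hypothetical and serve only as illustration.

```python
# Weight per credit for each letter grade, as defined in the footnote.
GRADE_WEIGHTS = {"A": 2, "B": 1, "C": 0, "D": -1, "F": -2}


def grade_point_average(courses):
    """courses: list of (letter_grade, credits) pairs for one period."""
    grade_point = sum(GRADE_WEIGHTS[grade] * credits for grade, credits in courses)
    total_credits = sum(credits for _, credits in courses)
    return grade_point / total_credits


# Hypothetical record: 3 credits of A, 4 of B, 3 of C, 2 of D.
record = [("A", 3), ("B", 4), ("C", 3), ("D", 2)]
print(round(grade_point_average(record), 2))  # 0.67, i.e. 8 grade-points over 12 credits
```

On this scale an average of zero corresponds to a letter grade of C and an average of one to a B.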

In order to study the question of whether college grades tend to 
improve as students progress in their college work, the college 
grade-point average of the non-veterans was determined at the 
end of the first, second and third years. The procedure used in 
calculating these grade-point averages was the same as that for 
the veterans. 


RESULTS 


A trend among veterans toward professional and technical
training, found by Tibbets and Hunter,¹⁴ is also apparent among
the veterans in this study. Of the three courses of study which 
students in this investigation were taking, technology was being 
taken by more than a proportionate number of veterans, while the 
non-veterans had tended to take science and social science 
courses. This trend held despite the fact that many veterans 
had transferred from the School of Technology to other courses of 
study. Table 3 indicates the number that have transferred to 
science or social science in their most recent degree change. The 
reason that the number of veteran technology students still 
remains high, despite this drainage, is that the vast majority 
of them entered college during the war years, when interest in 
engineering was at its peak. The Registrar’s tabulation of 
degree choices of entering freshmen for the semesters beginning
February, 1942, to February, 1944, shows that forty-four to forty- 





* The grade-point was computed by giving each credit of A a weight of plus 
two, each credit of B a weight of plus one, each credit of C a weight of zero, 
each credit of D a weight of minus one, and each credit of F a weight of 
minus two; the sum of the weighted credits is the grade-point. 


















TABLE 3.—NATURE OF MOST RECENT CHANGE OF COURSE BY
VETERANS AND NON-VETERANS


                                                Veterans              Non-Veterans
Course                                            Per Cent of           Per Cent of
Changed to:      From:                      N     Sample          N     Sample
                                                  Transferring          Transferring

Science          Technology                26        54           8        19
                 Arts                       1         2           0         0
                 Business Administration    0         0           1         2
                 Total                     27        56           9        21

Social Science   Technology                11        19           2         3
                 Science                   19        32           8        11
                 Arts                       6        10           5         7
                 Business Administration    3         5           6         8
                 Education                  2         3           4         6
                 Unspecified                1         2           0         0
                 Total                     42        71          25        35

Technology       Science                    7         6           0         0
                 Unspecified                2         2           0         0
                 Total                      9         8           0         0





































nine per cent of all freshmen entered the School of Technology, 
whereas in the semesters beginning February, 1945, to February, 
1946, inclusive, only twenty-eight to thirty-four per cent of all 
freshmen entered this School. These two periods correspond 
closely to the years when the veterans entered and to those when 
the non-veterans entered, respectively. The veterans were 
apparently not influenced by the post-war trend away from 
technology. Table 3 shows that very few students transferred to 
the School of Technology, so that almost all of the students in this 
School at the time of this study began their college careers in it. 

In Table 4, the mean pre-service grade-point average of the 
veterans in the School of Technology is presented along with the 
mean grade-point average of non-veterans in the same School for 
two equivalent periods of time. Differences of .21 (for Group A) 
and .22 (for Group B) of a letter grade are found in favor of the 
non-veterans; both differences are significant at the one per 
cent level. In the post-service period, however, veterans in 
Group A exceeded the scholastic achievement of non-veterans by 
.13 of a letter grade (P = .03). On the other hand, veterans in 
Group B did about as well in the post-service period as non- 
veterans in an equivalent period. The mean high-school aver- 
age and ACE Psychological Examination score of veterans and 
non-veterans in the School of Technology (see Table 2) presaged 
poorer scholastic achievement on the part of veterans. Despite a 
slightly lower mean ‘scholastic aptitude,’ veterans equalled or 
exceeded the achievement of non-veterans. 

The data are not as clear-cut in the case of the other degree 
groups (Table 3). Fully seventy-one per cent of the veterans 
and thirty-five per cent of the non-veterans taking the social 
science course had changed to it from another course; fifty-six 
per cent of the veterans and twenty-one per cent of the non-vet- 
erans taking the science course had changed to it from another 
course. This means that there has been a wide range of subjects 
taken in the first year or two; the difference is especially marked 
between those who had taken an engineering course and those who 
had taken a non-engineering course. These differences must be 
borne in mind even when comparing veterans with non-veterans 
taking the same course. 
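
The percentages quoted in this paragraph follow from the transfer counts in Table 3 and the final-sample sizes in Table 1. A minimal sketch of the arithmetic is given below; the variable and function names are illustrative.

```python
# Final-sample sizes from Table 1 and numbers who changed into each course (Table 3).
FINAL_SAMPLE = {
    ("Science", "Veterans"): 48,
    ("Science", "Non-Veterans"): 42,
    ("Social Science", "Veterans"): 59,
    ("Social Science", "Non-Veterans"): 72,
}
CHANGED_INTO_COURSE = {
    ("Science", "Veterans"): 27,
    ("Science", "Non-Veterans"): 9,
    ("Social Science", "Veterans"): 42,
    ("Social Science", "Non-Veterans"): 25,
}


def per_cent_transferring(key):
    """Whole-number percentage of the final sample that changed into the course."""
    return round(100 * CHANGED_INTO_COURSE[key] / FINAL_SAMPLE[key])


for key in FINAL_SAMPLE:
    print(key, per_cent_transferring(key))
```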

The veterans who were taking a science or social science course 
as of September, 1948, and who had taken a year or less of college 













TABLE 4.—MEAN PRE- AND POST-SERVICE AVERAGES FOR VETERANS, MEAN
AVERAGES FOR NON-VETERANS FOR EQUIVALENT PERIODS, AND TOTAL
AVERAGES FOR VETERANS AND NON-VETERANS

[Table body not legible in this copy. Columns: Group; N; and, for the
pre-service periods and equivalents, the post-service periods and
equivalents, and the total period, the mean number of credits and the
mean grade-point average, each with its SD.]

¹ Group A is composed of veterans who had completed eleven to thirty-two
pre-service credits; Group B, of those who had completed thirty-three or
more pre-service credits.

² The equivalent of a pre-service period was obtained by determining the
average for the first year; the equivalent of a post-service period was
obtained by determining the average for the second and third years.

³ The equivalent of a pre-service period was obtained by determining the
average for the first two years; the equivalent of a post-service period
was obtained by determining the average for the third year.













work in the pre-service period, had mean grade-point averages 
for this period of -.12 and -.08, respectively (equivalent to a
letter grade of slightly less than ‘C’). Non-veterans taking a
science course obtained a mean grade-point average of .66 (B-)
for an equivalent period, and those taking a social science course, 
a mean grade-point average of .29 (C+). In the case of the 
veteran group, the uncertainty of whether or not they would be 
drafted before the end of the semester, coupled with the enroll- 
ment of many of them in the Engineering School, when a large 
percentage probably lacked the aptitudes for technological train- 
ing, are suggested factors contributing to poor scholarship. On 
the other hand, those of the veteran group who were able to 
complete more than one year of school work in the pre-service 
period fared much better during this time with mean grade-point 
averages of .35 for the science and .20 for the social science groups. 
However, in an equivalent period, the non-veterans still surpassed 
the veterans with mean grade-point averages of .66 for the science 
and .36 for the social science groups. 

The post-service period gives an entirely different picture. 
Groups A and B of veterans taking a science or a social science 
course all showed increases in mean grade-point averages over the 
pre-service period. The non-veterans taking a science course 
were apparently a superior group from the beginning. Although 
both Groups A and B of the veterans taking a science course 
showed a marked improvement in the post-service period, they 
were not able to equal the consistently high performance of non- 
veterans. More than half of the veterans taking a science course 
had transferred from the School of Technology; these had 
started off with a poor adjustment to school work. This poor 
initial performance may have projected its effects to the time after 
the transfers had been made. Of veteran students taking a 
social science course, Group A obtained a mean grade-point 
average of .65 in the post-service period while non-veterans taking 
the same course obtained a mean grade-point average of .61. 
However, this slight difference in favor of the veterans is not 
statistically significant (P > .05). Group B also obtained a mean 
grade-point average of .65 in the post-service period but non- 
veterans obtained a mean of .75; the difference between the means 
is significant (P = .01). 

Group A of the veterans taking a technology course surpassed 













the B group in mean post-service grade-point average, although 
the mean pre-service grade-point averages of both groups had 
been very close. Group A, it may be recalled, is the group which 
had one year or less of college training before entering service. 
Therefore, for the technology students, the service experience 
appears to have had a beneficial effect on academic achievement. 
The comparison of Groups A and B of veterans taking the other 
courses is complicated by the variation in pre-service mean grade- 
point averages. Group A of veterans taking a science course 
showed a greater improvement in the post-service period over the 
pre-service period than did Group B. However, Group B still 
surpassed Group A in mean grade-point average. On the other 
hand, Group A of the veterans taking a social science course, 
whose academic achievement was less than that of Group B in the 
pre-service period, equalled Group B in the post-service period. 
The difference with respect to relative achievements of Groups A 
and B between veterans taking a science course and those taking 
a social science course may be explainable in terms of differences 
between the nature of the two courses. Success in advanced 
work in the science course is more dependent upon a good founda- 
tion in certain elementary courses than is the case in the social 
science course. In the social science course, the materials covered 
in the various subjects are not as interdependent. 


SUMMARY AND CONCLUSIONS 


The interpretation of many studies dealing with the relative 
scholastic achievement of veterans and non-veterans is limited 
because such factors as age and scholastic aptitude have not been 
adequately controlled or because tests of the significance of 
obtained differences have not been made. In general, it appears 
that the academic achievements of veterans were slightly superior 
to those of non-veterans.?:*!2:5 Although it was suggested that 
age alone was responsible, this thesis was not supported in fact. 
The service experience itself seems to have contributed to the 
better performance of veterans. The wide range of experiences, 
travelling, and constantly making adjustments, probably had 
their effect in enabling the veteran student to derive more from 
his education than the non-veteran. 

The present study includes four hundred fourteen male students
taking science, social science, or technology courses. The preference
of veterans for technological subjects, reported by Tibbets
and Hunter,¹⁴ shows up in the present study also. This trend
seems to be a carry-over from the war-time tendency toward these
fields.

In all instances, the mean post-service grade-point average for 
veterans was higher than their mean pre-service grade-point 
average. Furthermore, the veterans taking courses in technology 
exceeded their non-veteran classmates despite a lower mean 
high-school average, and a lower mean score on the Psychological 
Examination. However, veterans taking a science or social 
science course were not able to equal the consistently high per- 
formance of non-veterans. Factors that may contribute to this 
difference between students taking a technology course and those 
taking either a science or social science course are discussed. 


REFERENCES 


1) “Academic Achievements of Veterans at Cornell University.”
School and Society, LXV (1947), 101-102.

2) Edward L. Clark. “The Veteran as a College Freshman.” School and
Society, LXVI (1947), 205-207.

3) Paul E. Clark and Bernard A. Staskiewicz. “Achievements of
Veterans in General Chemistry.” School and Society, LXV (1947), 482-484.

4) Stephen E. Epler. “Do Veterans Make Better Grades than Non-
Veterans?” School and Society, LXVI (1947), 270.

5) Norman Garmezy and Jean M. Crose. “A Comparison of the Academic
Achievement of Matched Groups of Veteran and Non-Veteran Freshmen
at the University of Iowa.” Journal of Educational Research, xu
(1948), 547-550.

6) Arthur M. Gowan. “Characteristics of Freshmen Veterans.” Journal
of Higher Education, XX (1949), 205-206.

7) M. G. Orr. “Grade-point Averages of Veterans at Oklahoma Agricultural
and Mechanical College.” School and Society, LXVI (1947), 94.

8) Frederick D. Pultz. “Veterans in the Ohio State University College of
Education.” Educational Research Bulletin, XXVI (1947), 153-156.

9) G. W. Read. “Scholastic Achievement of Veterans.” (Abstract)
American Psychologist, I (1946), 452.

10) Svend Riemer. “Married Veterans are Good Students.” Marriage
and Family Living, IX (1947), 11-12.

11) Robert H. Schaffer. “A Note on the Alleged Scholastic Superiority of
Veterans.” School and Society, LXVII (1948), 205.

12) William B. Schrader and Norman Frederiksen. “The Comparative
Achievement of Veteran and Non-Veteran Students in College.” (Abstract)
American Psychologist, III (1948), 259.
















13) Edgar A. Taylor. “How Well Are Veterans Doing?” School and
Society, LXV (1947), 210-211.

14) Clark Tibbets and Woodrow W. Hunter. “Veterans and Non-
Veterans at the University of Michigan.” School and Society, LXV (1947),
347-350.

15) Ruth G. Weintraub and Ruth E. Salley. “Hunter College Reports
on Its Veterans.” School and Society, LXVIII (1948), 59-63.

16) Ernest L. Welborn. “The Scholarship of Veterans Attending a
Teachers College.” Journal of Educational Research, xu (1946), 209-214.





READABILITY OF CHILDREN’S TEXTBOOKS* 


EDMUND W. J. FAISON 
George Washington University 


The purpose of the present study is to compare the readability 
of the texts currently used in the fifth, sixth, seventh, and eighth 
grades of two school systems by application of the Flesch read- 
ability formulas. 

In 1948, Flesch* reported a revision of his readability formula. 
This readability yardstick has two indices: 1) the measure of 
‘Reading Ease,’ and 2) the measure of ‘Human Interest’ (hereafter
referred to as RE and HI, respectively). The RE formula
involves the average sentence length in words and the average
word length in syllables. The elements of the HI formula
are the average percentage of ‘personal words’ and the average
percentage of ‘personal sentences.’ Flesch developed this
formula only after careful analysis of all of the previous attempts 
to measure readability, and it is by far the most objective and the 
easiest to apply of all readability measures. 

The application of this formula to children’s textbooks is 
appropriate. The scale offers educators a means of objectively 
placing texts on a scale of comprehensibility; and it will also 
point up to the authors of these texts the parts which need 
special attention when they are writing for a particular level. 
With this scale as a criterion the proper grade-placement of 
certain books and of certain types of material can be determined. 
In this respect, Ayer¹ found that, due to use of figurative language
and abstract words and concepts, pupils with normal seventh- 
grade reading ability could answer only thirty-one per cent of the 
questions based on original fifth-grade histories used throughout
the country.


SAMPLING AND PROCEDURE


The books of two city school systems in the vicinity of Wash- 
ington, D.C. were used in this study. In the schools of City A 
there are no assigned books for each grade. The teacher is 
permitted to select whatever books he wants to use. Therefore, 





* This is an abstract of an M.A. thesis done at George Washington Univer- 
sity. The author gratefully acknowledges the assistance of Dr. E. Lakin 
Phillips who helped direct the study. 

















every class may use different texts. A teacher may use one 
basic text and have several other texts for supplemental reading. 
This made the problem of sampling difficult. The books selected 
were the ones which the bookroom clerk estimated were called for 
more often than the others. In some cases, this made it possi- 
ble to use the books of a specific series in this study. 

In City B, standard texts were used in every grade. The texts 
used in this study were obtained from one elementary school for 
the fifth- and sixth-grade books, and from a junior high school for 
the seventh- and eighth-grade books in both cities. 

The fifth grade was selected as the lowest grade for the study, 
inasmuch as Flesch devised his formulas with the top value of 
100 representing the score of books suitable for one who has com- 
pleted the fourth grade (one who is barely functionally literate 
in terms of the United States Department of Census). 

The sampling of the sixth grade was made in order to ascertain 
what changes in reading ease and human interest take place in 
books used in the elementary schools. Books used in the seventh 
grade were sampled in order to learn what changes take place in 
reading ease and human interest in the jump from elementary 
schools to junior high schools; and the eighth grade, in order to 
determine if a greater difference in human interest and reading 
ease takes place in junior high school than it does in elementary
schools.


PROCEDURE IN RATING BOOKS


The formulas for rating RE and HI are as follows: RE =
206.835 - .846 wl - 1.015 sl (where wl equals the average
word length in syllables per 100 words and sl equals the average
sentence length in words). HI = 3.365 pw + .314 ps (where pw
equals the average per cent of personal words and ps equals the
average per cent of personal sentences).
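
The two formulas can be sketched in code as follows. The coefficients are taken as printed above; Flesch's 1948 Human Interest formula is commonly quoted with 3.635 rather than 3.365 for personal words, so that coefficient is exposed as a parameter and should be checked against the original before use. The function names and the sample passage values are hypothetical.

```python
def reading_ease(syllables_per_100_words, words_per_sentence):
    """RE = 206.835 - .846 wl - 1.015 sl, with wl and sl as defined in the text."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * words_per_sentence


def human_interest(pct_personal_words, pct_personal_sentences, pw_coefficient=3.365):
    """HI = pw_coefficient * pw + .314 ps; the default coefficient is as printed here."""
    return pw_coefficient * pct_personal_words + 0.314 * pct_personal_sentences


# Hypothetical passage: 140 syllables per 100 words, 15 words per sentence,
# 6 per cent personal words, 20 per cent personal sentences.
print(round(reading_ease(140, 15), 2))   # 73.17
print(round(human_interest(6, 20), 2))   # 26.47
```

Both scores fall as text becomes harder or less personal: longer words and longer sentences lower RE, and fewer personal words and sentences lower HI.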

A score of 100 with the RE formula corresponds to the predic- 
tion that a child who has completed the fourth grade will be able 
to answer correctly three-fourths of the test questions to be asked 
about the passage that is being rated. An HI score of 100 indi- 
cates that the passage has enough human interest to suit the 
reading habits and skills of one who has completed the fourth 
grade.® 

A total of thirty-eight books were rated. One book was 
selected for each school subject from each grade for each school 













system. Thirty samples of one hundred words each were selected 
on a strictly numerical basis from each book. Each sample 
started at the beginning of a paragraph. The number of sylla- 
bles, number of sentences, number of personal words, and the 
number of personal sentences per each one hundred words were 
counted. 

The exact procedure recommended by Flesch was used in all 
cases with the exception of the mathematics texts. It was con- 
sidered advisable to apply a special procedure to them so that a 
comparison could be made with the other texts. 

A large portion of all mathematical books consisted of addition 
or subtraction problems such as: 


2735 
4624 
3552 
+6189 





It was felt that these numbers had to be considered and could not 
be skipped over as Flesch recommends. Also, if each horizontal 
row was considered as one four-place number, a large number of 
syllables (and consequent increase in difficulty) would occur which 
apparently is not the case. In this study, an addition or sub- 
traction problem which had to be completed by the reader was 
computed on the following basis: Every individual number 
instead of every horizontal row was counted as a word. This 
procedure was followed because every student reading these 
problems would read each vertical column one number at a time, 
as he actually performs the addition. Each problem was counted 
as a sentence since it represented a complete unit. This is a 
somewhat arbitrary procedure, but it appeared to be the most 
defensible one under the circumstances. 
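
The counting rule for arithmetic problems can be illustrated with a short sketch. It assumes that "every individual number" means every single digit, and it covers only the word and sentence counts, since the treatment of syllables for digits is not described. The function name is hypothetical.

```python
def count_arithmetic_problem(rows):
    """Return (words, sentences) for one column-arithmetic problem.

    Each digit counts as one word and the whole problem as one sentence;
    operator signs such as '+' are not counted.
    """
    words = sum(character.isdigit() for row in rows for character in row)
    return words, 1


# The addition problem shown in the text.
print(count_arithmetic_problem(["2735", "4624", "3552", "+6189"]))  # (16, 1)
```

Counting each row as a single four-place number would instead give four long words, which is the inflation in apparent difficulty that the special procedure was meant to avoid.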

In counting personal words all personal pronouns and nouns of 
a masculine or feminine gender were included. The group words 
‘folks’ and ‘people’ (with a plural verb) were also counted. 
Personal sentences were those which were spoken as indicated by 
quotation marks or otherwise; questions, commands, requests, 
etc. directly addressed to the reader; exclamations; and gram- 
matically incomplete sentences whose full meaning had to be 
inferred from the context. 













RESULTS 


A comparison of the two school systems (Table I) revealed few
large differences in the ease of reading of their texts.
The books from both systems showed a progressive decrease in 
reading ease from the fifth through the eighth grades, with the 
smallest difference between the seventh and eighth grades. 


TABLE I.—MEAN SCORES OF READING EASE AND HUMAN INTEREST
OF ALL TEXTS IN CITY A AND CITY B, GRADES FIVE
THROUGH EIGHT

                 City A                 City B
Grade     RE Score | HI Score | RE Score | HI Score
5th         78.5   |   29.8   |   76.0   |   31.0
6th         71.5   |   27.0   |   74.8   |   34.0
7th         68.6   |   39.0   |   69.8   |   35.3
8th         68.6   |   32.3   |   68.8   |   33.6
Mean        71.8   |   32.0   |   72.7   |   33.4

















The reading ease scores are in the upper bracket of the range 
Flesch reports as ‘fairly easy’ in the fifth grade and steadily drop 
to the upper portion of what Flesch calls the ‘standard’ range in 
the seventh and eighth grades. 
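The verbal categories used throughout this discussion can be looked up mechanically. The sketch below assumes the score bands usually quoted from Flesch's 1948 tables; the article itself names the categories but does not list the cut-off points, and the treatment of a score that falls exactly on a boundary is an assumption here.

```python
# (lower bound, label) pairs, highest band first. Assumed from Flesch's 1948 tables.
RE_BANDS = [(90, "very easy"), (80, "easy"), (70, "fairly easy"),
            (60, "standard"), (50, "fairly difficult"),
            (30, "difficult"), (0, "very difficult")]
HI_BANDS = [(60, "dramatic"), (40, "highly interesting"),
            (20, "interesting"), (10, "mildly interesting"), (0, "dull")]


def label(score: float, bands: list[tuple[int, str]]) -> str:
    """Return the label of the first band whose lower bound the score reaches."""
    for lower, name in bands:
        if score >= lower:
            return name
    return bands[-1][1]


# City A mean reading ease scores from Table I.
for grade, score in [("5th", 78.5), ("6th", 71.5), ("7th", 68.6), ("8th", 68.6)]:
    print(grade, score, label(score, RE_BANDS))
```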

Examination of the table shows that the HI scores of books of 
each school system reveal different patterns. These patterns, 
however, are difficult to analyze. Flesch defines a score of HI 
100 as the score which a book must obtain if it is to hold the 
interest of a person with a fourth-grade education. This has 
been interpreted to mean that as a person becomes better edu- 
cated, the reading material can gradually decrease in human 
interest and still hold the reader’s attention. 

All of the mean human interest scores for each grade are within 
the category which Flesch calls ‘interesting.’ The range of the 
scores is not great, but the patterns within the range are of inter- 
est. In City B, the fifth-grade texts were the least interesting. 
A gradual increase in human interest occurred up to the eighth 
grade, where it dropped to a level slightly below that of the sixth 
grade. Although the differences are small, they show a trend. 













In City A, the sixth-grade texts were less interesting than those 
of the eighth grade. But the textbooks of the junior high school 
(seventh and eighth grades) were more interesting than those of 
the elementary school. 

With the exception of the seventh-grade score, the schools of 
City B used more interesting texts on the whole. The high 
seventh-grade human interest score (City A) was due to the very 
large increases in HI in science and literature from the sixth to 
the seventh grade. 


ANALYSIS OF RESULTS BY GRADES 


Results by grades are given in Table II. Although the average 
reading ease scores dropped consistently, many discrepancies 


TABLE II.—COMPARISON OF MEAN READING EASE AND HUMAN
INTEREST SCORES OF THE TEXTS OF FIVE SUBJECTS TAUGHT
IN TWO SCHOOL SYSTEMS FROM THE FIFTH TO THE EIGHTH
GRADES INCLUSIVELY

                              5th         6th         7th         8th
Subject                    RE  | HI  | RE  | HI  | RE  | HI  | RE  | HI
English          City A |  75  | 50  | 70  | 34  | (remaining entries illegible)
                 City B |  76  | 33  | 77  | 45  | 77  | 52  | 83  | 57
History          City A |  85  | 51  | 71  | 29  | 68  | 34  | 60  | 23
                 City B |  81  | 29  | 75  | 21  | 68  | 22  | 60  | 18
Mathematics      City A |  71  | 30  | 69  | 26  | 55  | 19  | 58  | 19
                 City B |  69  | 21  | 78  | 30  | 58  | 34  | 62  | 20
Reading or Lit.  City A |  79* | 29* | 74  | 39  | 74  | 55  | 80  | 57
                 City B |  86  | 47  | 81  | 57  | 74  | 42  | 72  | 43
Science          City A |  79  | 09  | 72  | 14  | 74  | 37  | 75  | 28
                 City B |  68* | 25* | 63  | 17  | 72  | 26  | 67  | 30
































*In City A, Social Studies is taught instead of Reading in the fifth and 
sixth grades. In City B, Geography is taught rather than Science in the 
fifth and sixth grades. These subjects are listed for comparison. 


occurred among the individual subjects. In City B, English
became easier to read as the grade level increased. This result
was produced mainly by the larger number of syllables in the
books for the lower grades.













In both cities the reading ease of mathematics increased from 
the seventh to the eighth grades. This increase was due to the 
change in style of arithmetic problems. Instead of listing 
columns of problems to be added or subtracted, the eighth grade 
books more often expressed the problems in a realistic fashion 
with the result that a little story was told about every problem 
situation and the effect of the actual numbers was minimized 
in figuring reading ease. 

This result was achieved primarily because of the difference in 
the subject matter taught. In the seventh grade common frac- 
tions, decimal fractions, integers, measurement, and percentages 
are the primary topics considered. In the eighth grade, banking, 
insurance, taxes, geometry, investments, and mensuration are 
taught. It can readily be seen that the topics in the eighth 
grade cannot have problems which are expressed in rows with 
instructions to perform the necessary operations. 

There are other inconsistencies which cannot be explained on 
any basis other than the style of the authors. The largest 
drop in reading ease occurred in mathematics between the sixth 
and seventh grades. Here the child takes only arithmetic in the 
elementary school, but when he begins junior high school he is 
confronted with the much broader and more difficult subject 
of mathematics which includes all of the topics already mentioned. 
This increase in difficulty of the subject is reflected in the reading 
ease score. 

It would seem that the factor of human interest has been 
practically ignored by the writers of all texts with the exception 
of the subject of mathematics. As is expressed in the preface of 
many of the mathematics books, there has been a conscious effort 
to state as many exercises as feasible in problem situations rather 
than just problems. These authors have realized that real 
learning takes place more rapidly when the child is projected into 
the problem situation. It is a real tribute to these authors that 
they have made their subject matter more interesting than science 
or history in some cases. 

Only the fifth grade science text of City A fell into the ‘dull’ 
category. It included few personal sentences or words. 


ANALYSIS OF RESULTS BY SCHOOL SUBJECTS 


The over-all trends are shown in Table III. It can be seen 
from this table that the most difficult subject was mathematics. 
















Its mean reading ease score for all four grades (65.0) was lower,
and hence its average difficulty higher, than the mean
eighth-grade score for all subjects.


TABLE III.—COMPARISON OF THE MEAN READING EASE SCORES
AND MEAN HUMAN INTEREST SCORES OF THE TEXTS OF FIVE
SUBJECTS FROM THE FIFTH THROUGH THE EIGHTH
GRADES OF THE CITY A AND CITY B SCHOOL
SYSTEMS

               Math     Science   History   English   Literature
City A
  RE           63.25    75.0      71.0      71.0      76.75
  HI           23.25    22.0      34.25     42.0      45.50
City B
  RE           66.75    67.50     71.0      78.25     78.25
  HI           26.25    24.50     22.25     49.25     47.25
Mean Score
  RE           65.0     71.25     71.0      74.63     77.50
  HI           24.75    23.25     28.25     45.63     46.37


The procedure in rating mathematics books was different from 
that used for the others. The modification of the method recom- 
mended by Flesch was incorporated to increase the reading ease 
scores so that they could more readily be compared with the 
scores of the other texts. If the procedure Flesch recommends 
had been applied to the mathematics books, the scores would have 
been even lower on the reading ease scale. With either procedure, 
however, mathematics would still be considered the most difficult 
subject when using the reading ease score as a criterion. 

History, with a reading ease mean score of 71.0, was the second 
most difficult subject. This score according to Flesch’s norms 
would be on the border between ‘standard’ and ‘fairly easy.’ 

The history average was the same in both cities, but in City B, 
history was the third most difficult subject to read, with science 
preceding it; whereas in City A, history and English were tied for 
second and third (2.5) place in the rank order scale of difficulty. 
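The tied positions such as 2.5 follow the usual convention of giving tied items the average of the ranks they occupy. A minimal sketch of that convention, applied to the reading ease means of Table III (the function name is hypothetical, and rank 1 is taken to be the most difficult subject, that is, the lowest RE score):

```python
def average_ranks(scores: dict[str, float]) -> dict[str, float]:
    """Rank items from lowest to highest score, averaging ranks over ties."""
    ordered = sorted(scores, key=scores.get)
    ranks: dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over every item tied with the item at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


city_a_re = {"Math": 63.25, "Science": 75.0, "History": 71.0,
             "English": 71.0, "Literature": 76.75}
city_b_re = {"Math": 66.75, "Science": 67.50, "History": 71.0,
             "English": 78.25, "Literature": 78.25}
print(average_ranks(city_a_re))
print(average_ranks(city_b_re))
```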

Science books had an over-all mean of 71.25. This merits 
third place on the scale, but the difference of .25 between this 
score and the mean history score is too small to be considered 
significant. 

In City B, science ranked second, while in City A, it ranked 
fourth in the order of reading difficulty. 













In the fifth and sixth grades in City B, geography is taught 
in place of science, and in this study geography was considered 
with the science group. 

English was the fourth of the five subjects on the composite 
mean rating scale. There was, however, a marked difference 
between the two school systems. English was tied for the second 
and third positions (actually position 2.5) with history in City 
A; and tied with literature for the fourth and fifth places (actually 
4.5) in City B. 

Literature was found to be the easiest subject with an average 
reading ease score of 77.5. This over-all average is at the mean 
of all the subjects of the fifth grade. 

Literature, as such, is not taught in the elementary schools. 
In City B, fifth- and sixth-grade readers are used. These readers 
contain light stories about the activities and accomplishments of 
children. In City A, the elementary schools have books which 
are entitled ‘Social Studies.’ These books are used as readers, 
but they contain stories which are more directly related to educa- 
tional topics. It is not surprising, therefore, to find that the 
City B elementary schools have higher reading ease scores for 
their readers than in City A. 

Mean human interest scores are also given in Table III. The 
mathematics textbooks had a human interest value of 24.75. On 
the rank-order scale which begins with the least interesting sub- 
ject, mathematics texts ranked second with a value 1.5 points 
higher than science. 

In City A, the human interest score of mathematics ranks 
exactly as it does in the composite mean. In City B, mathe- 
matics is third, falling behind history and science. 

History ranked third on the composite human interest scale 
with the score of 28.25. This agreed with the result obtained in 
City A; but in City B, history was the dullest subject in terms of 
human interest. This result was caused by the long sentences 
and few personal references. 

Science was the dullest subject on the composite scale and in City A.
This result was due mainly to the lack of personal words. It is 
very difficult to be personal about scientific facts unless an 
experiment is described by the worker who performed it. In 
City B, science was second with history preceding it by 2.25
points.













In City A, literature was slightly higher than English and
was, therefore, fifth in that series and on the composite scale.
In City B, literature was slightly lower than English, but not
enough to reverse the positions on the composite scale. The
averages of both subjects fall in the group Flesch calls ‘highly
interesting.’ These two subjects are nearly 20 points higher than
the other subjects in human interest.


SUMMARY 


All books rated for reading ease fell within the range of Flesch’s 
categories ‘fairly difficult’ to ‘easy.’ The norms of each grade 
for both school systems showed a consistent decrease with the 
scores gradually leveling out between the seventh and eighth 
grades. The subjects were ranked in order of difficulty (reading 
ease score) and the following results were obtained: Mathematics 
(most difficult), history, science, English, literature (easiest). 

The mathematics average for all four grades was lower than 
the average of all subjects of the eighth grade. The literature 
average for all grades was approximately that of the average of 
all of the subjects of the fifth grade. 

No definite pattern was shown in human interest scores. All 
of the averages for the individual grades were in the ‘interesting’ 
range, however. The mathematics texts were the only ones in 
which a conscious attempt had been made by the author to 
personalize the material presented, but even there no system 
seemed to stand out. 

BIBLIOGRAPHY 


1) A. M. Ayer. Some Difficulties in Elementary School History.
Contributions to Education. No. 212. New York: Bureau of
Publications, Teachers College, Columbia University, 1926.

2) J. N. Farr and J. J. Jenkins. “Tables for use with the
Flesch readability formulas.” J. Appl. Psychol., 1949, 33,
275-278.

3) R. Flesch. “A new readability yardstick.” J. Appl. 
Psychol., 1948, 32, 221-233. 











SELECTION OF STUDENTS FOR A TRADE AND 
INDUSTRIAL EDUCATION CURRICULUM 


H. S. BELMAN AND R. N. EVANS
Purdue University 


Numerous articles by educators have appeared, suggesting that 
it would be desirable to select students for teacher training in 
industrial education, rather than to accept all those who apply. 
However, few concrete proposals or examples have been presented. 

In 1942, Pawelek found that “in eighty-two per cent of the
industrial arts departments, any student admitted to the
university is automatically admitted to the department.” When
one hundred leading educators in the industrial arts field were
polled, however, more than ninety-five per cent of them favored
setting up definite admission standards.²

Land has been one of the few leaders in the field of industrial 
education who has reported on a comprehensive, objective 
method of selection. He states that prospective teachers of 
vocational industrial subjects at Pennsylvania State College are 
given a battery of tests which include “a four-hour trade
performance test administered by a competent examiner, a
three-hour trade theory examination, a psychological examination, an
English test, and an interview by a competent committee of
three.” This selection procedure is based upon a definite
philosophy which might well be adopted elsewhere, for Land goes on to
say: “The teacher-training institution which merely conducts
courses for the preparation of teachers, without regard to their
adequate selection and placement . . . is not meeting its
responsibility.”¹ However, concrete programs of this or any other type
are the exception at present, and even Land makes no reference 
to studies of the validity and optimum weighting of these 
predictors. 

Traditionally, the selection of candidates for industrial teacher 
training has been done on the basis of physical appearance, 
method of shaking hands, and a perusal of assorted grades, tests, 
and recommendations. Little objective study has been devoted 
to the relevance of such predictors of success. The fact that a 
sizeable portion of the men selected in this fashion have been 


successful is an indication that it has produced some good results, 













but as competition increases in the field of industrial education, 
any method of selection which does not yield perfect results 
should be examined periodically. 


STATEMENT OF THE PROBLEM 


The Trade and Industrial Education curriculum, Division of 
Education and Applied Psychology, Purdue University, prepares 
students for one of three types of employment: 


1) Teaching industrial education subjects in the public sec- 
ondary schools. (a) vocational industrial. (b) indus- 
trial arts. 

2) Training within industry. 

3) School building service (superintendents of buildings and 
grounds). 


Students enter this curriculum from two principal sources:
approximately thirty per cent enter the curriculum as freshmen, 
and approximately seventy per cent transfer from other schools 
of the University. This study is concerned with the selection of 
the latter group from among those who apply. 

The transfer problem in any large university is a troublesome 
one. Many students enroll, for example, in engineering, believing
that it involves primarily manipulative work. They often
persevere in the school of their original choice for three or four
semesters, becoming progressively more maladjusted and unhappy
and earning progressively poorer scholastic records. Other students
with no particular aptitude for college work also have
difficulties with their present programs. Both types are among
the applicants for transfer to trade and industrial education.
Obviously, it is in the best interest of both the student and the
school that only those with aptitude for teaching in trade and
industrial education be selected.


DESIGN OF THE EXPERIMENT 


The present research is based upon an unpublished study by 
H. W. Porter and R. N. Evans⁴ in which records of eight tests
taken by twenty-seven graduates of the Trade and Industrial 


















Education curriculum at Purdue were studied. Using the 
criterion of grade point index after transfer to the Trade and 
Industrial Education curriculum, the best individual predictor of 
the eight studied yielded a correlation of .67 with the criterion. 
This study indicated that students could be selected more effec- 
tively than had been possible in the past. 

The next problem was to secure more representative data 
covering a larger number of predictors. With this in mind, 
arrangements were made to administer a test battery made up of 
sixteen separate predictors to all World War II veterans and 
twenty-five per cent of the non-veterans in the T. & I. E. cur- 
riculum.* Participation in the testing program was voluntary, 
but ninety per cent of the eligible students took the entire battery 
of tests upon which these results are based. Ten additional 
predictors were obtained from orientation test scores and from 
student records, making a possible total of twenty-six test scores 
and other pertinent facts for each of the one hundred seven 
students who participated. 


MEASUREMENT TECHNIQUES AND RESULTS 


As in the previous study, the criterion of grade point index 
after transfer was correlated with each of the predictors, yielding 
the results shown in Table I. A measure of statistical significance
was applied³ and all of the predictors which proved to be
significantly non-zero at the one per cent level were selected for
inclusion in a Wherry-Doolittle test selection problem.⁵,⁶ This
procedure provides a means by which tests are selected from a 
battery in the order in which they contribute to the prediction 
of the criterion, and at the same time eliminates tests which 
contribute nothing new to this prediction. A further result of 
this procedure is the determination of the optimum weighting for 
each test selected. 
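The article does not say which significance test was applied to the correlations. As an illustration only, the sketch below uses one common choice, the Fisher z transformation with a normal approximation, to ask whether a correlation r based on n cases differs from zero at the one per cent level. It is not necessarily the authors' test, and borderline values may be classified differently from Table I. The function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def r_is_nonzero(r: float, n: int, alpha: float = 0.01) -> bool:
    """Two-sided test of r against zero via Fisher's z (normal approximation)."""
    z = atanh(r) * sqrt(n - 3)                      # standardized Fisher z
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.576 for alpha = .01
    return abs(z) > critical


# Two clear-cut rows of Table I: grade in the introductory course
# (r = .5381, N = 103) and veteran status (r = .0806, N = 107).
print(r_is_nonzero(0.5381, 103))
print(r_is_nonzero(0.0806, 107))
```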

For students who have been allowed to enroll in Trade and 
Industrial Education courses on a one-semester trial basis, the 
prediction sheet shown in Table II is used as a basis for deciding 





* These tests were administered and scored by the staff of the Veterans’
Guidance Center at Purdue. The cost of testing non-veterans is borne by
the University.










TABLE I.—STUDYING TRANSFER STUDENTS IN TRADE
AND INDUSTRIAL EDUCATION

Index after Transfer = Criterion

                                                  Correlation
                                                     with
Predictors                                         Criterion      N
Kuder Mechanical Interest                            .0335       107
Kuder Computational                                 -.0056       107
Kuder Scientific                                     .0437       107
Kuder Persuasive                                     .0128       107
Kuder Artistic                                       .0529       107
Kuder Literary                                       .0364       107
Kuder Musical                                        .1309       107
Kuder Social Service                                 .1670       107
Kuder Clerical                                       .0059       107
Guilford-Martin O                                    .0109       107
Guilford-Martin Ag                                   .1482       107
Guilford-Martin Co                                   .1659       107
How Supervise                                        .2893*      107
How I Teach                                          .3154*      107
Purdue Mechanical Adaptability                       .1875       106
Purdue Adaptability                                  .4120*      102
American Council on Education Psych.                 .2631       107
Purdue English                                       .2995*       99
Purdue Mathematics                                   .3441*      100
Purdue Physical Science                              .2621        82
Index before transfer                                .4982*       95
Grade in Introductory Course to Trade and
  Industrial Education                               .5381*      103
Previously dropped from University—yes or no         .2678       107
Failed or passed Engineering Drawing                 .4368*      107
Veteran or non-veteran                               .0806       107
Definitely chose option—yes or no                    .4347*      107

Number for which every predictor was available = 83

* Indicates one per cent level of confidence that r is non-zero.












whether or not they should continue. The predictors shown in 
the table are not adequate for the majority of transfer students, 
however, since they do not have the opportunity to enroll in 
Education 30 (Introduction to Trade and Industrial Education) 
before transferring. For this reason, the statistical computation 
was done a second time, omitting this predictor, which produced 
slightly different results. The multiple correlation coefficient for 
the predictors shown in Table II was .743, while the multiple 
correlation coefficient for the second method, which did not 
include the one predictor, was .731. 

It should be noted that these correlation coefficients, while far 
from perfect, are considerably higher than most of those obtained 
with test batteries now widely used in predicting success in 
industry, and in colleges, universities, and high schools. How- 
ever, at least a temporary upper limit has been approached, and 
it is unlikely that correlations much higher than about .75 will 
be obtained, until some method of increasing the reliability of 
grades has been devised and put into effect. 


TABLE II.—PREDICTION SHEET FOR USE WITH UNCLASSIFIED
STUDENTS

Grade in Education 30                               × .2025 = ______
Index Before Transfer                               × .3258 = ______
Score on Purdue Adaptability
  (percentile, Purdue Seniors)                      × .0029 = ______
Score on How Supervise, Form M
  (percentile, Level II Supervisors)                × .0026 = ______
Score on English Orientation Test
  (percentile, Purdue Freshmen)                     × .0026 = ______
If This Student Has Definitely
  Chosen an Option, add                               +.3200
Plus                                                  1.4270
Equals Expected Grade Point
  Average After Transfer                              ______
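The arithmetic of the prediction sheet can be sketched as a small function. The weights and the constant are those printed in Table II; the function and parameter names, and the sample student, are hypothetical. The half-grade-point band corresponds to the limit lines described for Figure 1, which the authors note have no statistical significance.

```python
def expected_index(grade_ed30: float, index_before: float,
                   adaptability_pct: float, how_supervise_pct: float,
                   english_pct: float, chose_option: bool) -> float:
    """Expected grade point average after transfer, per the Table II weights."""
    total = (0.2025 * grade_ed30
             + 0.3258 * index_before
             + 0.0029 * adaptability_pct
             + 0.0026 * how_supervise_pct
             + 0.0026 * english_pct)
    if chose_option:
        total += 0.3200          # bonus for having definitely chosen an option
    return total + 1.4270        # additive constant from the sheet


def limit_band(predicted: float, half_width: float = 0.5) -> tuple[float, float]:
    """Plus or minus one-half grade point around the predicted index."""
    return predicted - half_width, predicted + half_width


# Hypothetical student: grade 4.0 in Education 30, index 3.5 before transfer,
# 50th percentile on all three tests, option definitely chosen.
predicted = expected_index(4.0, 3.5, 50, 50, 50, True)
print(round(predicted, 4), limit_band(round(predicted, 1)))
```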





As another way of showing the results obtained, Figure 1 has 
been prepared. The limit lines have no statistical significance, 
but are merely plus or minus one-half grade point, and show that 
if, for example, it is predicted that a student will make a 4.0 grade 













[Figure 1. Prediction of Success in Trade and Industrial Education,
Purdue University. Vertical axis: actual grades; horizontal axis:
predicted grades.]
















index, it is quite unlikely that he will actually make below 3.5 
or above 4.5. 

The prediction methods described above are now in use in the 
Trade and Industrial Education curriculum at Purdue, but the 
process of further development of these selection techniques is 
still going on. New transfer students and the remainder of the 
non-veterans in the school are now taking a different battery of 
tests. The results achieved will be correlated at a later date
with the grade point indices obtained by these students, and the
whole selection procedure will be overhauled in the light of these
new data. It is felt that only through a continuing process of
experimentation can optimum results be obtained. 


CONCLUSIONS 


1) The scholastic success of students entering a teacher-
training curriculum in industrial education can be predicted with
a reasonable degree of accuracy when correct techniques are used. 

2) Such predictions can be used with considerable assurance in 
counselling students; and, when institutional policies permit, 
these predictions can form a very important part of the student 
screening and selection procedure. 


BIBLIOGRAPHY 


1) S. Lewis Land. “The Teacher Training Institution and
Postwar Industrial Education.” Industrial Arts and Vocational
Education, 35: 5-7, January, 1946.

2) S. Pawelek. “Some Aspects of Industrial Arts Teacher
Preparation.” Industrial Arts and Vocational Education, 31:
147-9, April, 1942.

3) C. C. Peters and W. R. Van Voorhis. Statistical Procedures
and Their Mathematical Bases. New York: McGraw-Hill, 1940.

4) H. W. Porter and R. N. Evans. “Predicting Success in the
Trade and Industrial Education Curriculum.” (Unpublished.)
Purdue University, 1949.

5) W. H. Stead, C. L. Shartle, et al. Occupational Counseling
Techniques. New York: American Book Co., 1940.

6) J. Tiffin. Industrial Psychology. New York: Prentice-Hall,
1947.











BOOK REVIEWS 


MATTHEW N. CHAPPELL. In the Name of Common Sense: Worry
and Its Control. New York: The Macmillan Company, 
1949, pp. 172. $2.75. 


The title of this book, In the Name of Common Sense, bears 
little relationship to the actual content; the sub-title, Worry and 
Its Control, indicates more accurately the nature of the material 
presented. 

The author seems not to have made up his mind whether he is 
writing for professional or lay readers. For example, it is difficult
to see what the average person would make out of the following
quotation: “Life, Pike states, is a function of four variable
factors: (E), the environment; (PC), the physicochemical conditions
of the body; (G), the glandular system; and (N), the
nervous system. Mathematically, Pike’s formula is stated as
follows:


L = f((E)(PC)(G)(N))”¹


On the other hand, such passages as the following hardly appeal 
to the psychologist or the psychiatrist: “The word ‘neurotic’ is
a relic of a mistake made two or three hundred years ago. So
are its cousins, ‘hysteria,’ ‘neurasthenia,’ and ‘psychasthenia.’
They are all mistakes of past centuries. Now they are nearly
meaningless. They all mean the same thing: that one has
developed outstanding skill in making himself emotional and in
maintaining that skill. Anyone who tries to make them mean
more is talking through his hat.”²

The book is confusing at many points. Chappell rightly con- 
demns the concept of will power, yet his recommendations for the 
control of worry sound surprisingly like the admonition to use 
will power and stop worrying. He urges his reader not to talk 
about his difficulties, asking him to make a resolution that he 
will discuss his troubles with no one except his physician. This 
point of view is not in keeping with modern therapeutic methods. 
It is rather generally accepted that catharsis is the first step in 
psychotherapy. One gets the impression from reading Chappell’s 
book that worriers should begin with positive action instead of 





¹ p. 163.
² pp. 87-88.


















ending with it. There is little understanding here of the dynamics
of anxiety. Insight is referred to but not emphasized. 

This reviewer sees but little in this book to recommend it. 
It is too technical for the layman and too superficial and opinion- 
ated for the professional reader. Moreover the point of view 
presented is frequently at variance with well established principles
of psychotherapy.

HERBERT A. CARROLL
University of New Hampshire


A. ANASTASI AND J. P. FOLEY, JR. Differential Psychology.
Revised Edition. New York: The Macmillan Co., 1949,
pp. 894.


This book is thoroughly revised and greatly enlarged. The 
revised edition is larger by two hundred seventy-four pages, and 
each page contains about one-fourth more words than the original 
edition. Whereas the first edition makes use of five hundred 
sixty end-of-chapter references, the revised edition presents nearly 
fifteen hundred. The revised edition contains four additional 
chapters, the material of which is largely new. These are “Basic
Concepts of Psychological Testing,” “Biological Factors in
Simple Behavior Development,” “Psychological Factors in
Simple Behavior Development,” and “Schooling and Intelligence.”
Practically all the other chapters are expanded, sometimes
by the addition of new sections and sometimes by a complete
rewriting.

Style, purpose and general method of treatment of data as 
found in the original edition are largely preserved. There is the 
same or perhaps slightly greater emphasis upon methodological 
problems and interpretation. Certainly both editions amount to 
source books in their field, but much of their significance lies in 
the fact that they present the student with a coherent, significant 
and meaningful account of the data. As such, the revised edi- 
tion, like the first, should make a useful textbook. It should fill 
the needs of collateral reading for several courses in psychology. 
It is a significant contribution.

J. B. STROUD
State University of Iowa


JEAN PIAGET AND BARBEL INHELDER. Le Développement des
Quantités chez l’Enfant. Conservation et Atomisme. Paris,
7e, rue de Grenelle: Editions Delachaux & Niestlé S. A.,
pp. 344.


The specific processes of maturation and of learning through 
which children’s early understandings are formed constitute a 
subject of great interest to psychologists and teachers. Cer- 
tainly, all too little is known in this area at present. Piaget and 
his collaborator contribute here a significant report of research 
in the formation of quantitative concepts by children, with three 
principal problems involving experimental induction as the cen- 
ters of attention. 

The first problem attacked by Piaget and Inhelder is this: By 
what sequential development and at what age may ideas of the 
permanence of substance, of weight and of volume become under- 
stood by children? The technique used was as follows: The child 
was given a little ball of clay and was directed to make another 
just like it, “the same size and the same weight.” Then one of
the balls was altered in shape, either by lengthening it to the 
form of a sausage, or by flattening it like a pancake, or by some 
other method. Thereupon, the child was asked whether the two 
balls still had the same weight, the same quantity of material, 
and the same volume. He was also urged to explain his answers. 

Four important stages were observed in the successive appear- 
ance of concepts inherent in this first problem: 

1) Before the age of seven or eight, the child understands 
neither the permanence of the substance, nor of the weight, nor 
of the volume of the deformed ball of clay. 

2) At the approximate age of eight to ten, the child compre- 
hends that the ball is of the same substance, but not that it has 
the same weight and volume as before. 

3) From ten to eleven or twelve, he understands that it is of 
the same substance and weight, but not yet that it has the same 
volume. 

4) Beginning with age eleven or twelve, the child is able to 
recognize simultaneously the three forms of permanence of the 
deformed ball of clay, with a tendency to reduce the idea of sub- 
stance to those of weight and of volume. 

The second aspect of this research also concerns itself with 
children’s ideas regarding the permanence of substance, and, 
more specifically, with what becomes of sugar when it is dissolved 
















in water. Piaget and Inhelder, working with more than one 
hundred children from four to twelve, found indications of the 
same stages in development of understandings as those observed 
in the experiment with the small balls of clay. At first, the child 
believes that the substance of the sugar, when dissolved, dimin- 
ishes to nothing. Later, he understands that the substance is 
maintained, but thinks that the sugar’s weight and volume are 
lost. Then he adds the idea of the invariability of weight and, 
finally, of the invariability of volume. 

A third area of experimentation reported in this volume is 
related to children’s concepts of the nature of substance. Such 
materials as a cork, a pebble (smaller but heavier than the cork), 
and a piece of wood (of a size intermediate between the two, 
heavier than the cork, but lighter than the pebble) were pre- 
sented to the children and made the subjects of questioning. 
Four stages of conceptual growth are recorded, indicating progress 
from the belief that a body is heavy in proportion to its size, to an 
understanding of density, dilation and contraction. 

The four stages of development which the investigators found 
in each of these experiments may seem to the reader to be a 
rather too highly systematized means of reporting the findings. 
The important conclusions, perhaps, are contained in the observa- 
tion of a consistent sequential development of the concepts, with a 
similar pattern in each experiment. Such conclusions may have 
valuable implications in terms of curriculum planning and the 
direction of learning activities in the elementary school. 

Any experimental work is commendable which adds to our 
knowledge of how children change through maturation and 
through perceptual experiences, at what rate and through what 
sequences they arrive at insightful response to the innumerable 
problematic situations where they find themselves. Piaget has, 
in this volume, added notably to his already famous contributions 
in the study of children’s concepts. He and his associate have 
suggested lines of experimentation which may continue to 
increase in important ways our understanding of children. 

Willis N. Potter


College of the Pacific 


Saul Rosenzweig with Kate L. Kogan. Psychodiagnosis.
New York: Grune and Stratton, 1949, pp. 380.

As a sequel to F. L. Wells’ Mental Tests and Clinical Practice 
and Murray’s Explorations in Personality, this book explains the 
purpose, materials, obtained data, scoring methods, and interpre- 
tation of a representative group of psychological tests as they 
are employed in up-to-date psychodiagnosis. 

This treatment, addressed to beginning students of professional 
psychology and those in the allied fields of medicine, nursing, 
social work, etc., is in untechnical language; it excludes standard- 
ization procedures and gives little emphasis to theoretical 
considerations. 

A brief discussion of clinical psychology as a diagnostic art is 
followed by several chapters dealing with tests of general intelli- 
gence, measures of intellectual deviation, tests of vocational apti- 
tude and interest, and projective techniques. The process of 
psychodiagnostic integration is presented in terms of concrete 
case history examples, illustrating clinical procedure in the 
synthesis of information obtained from the separate tests and 
interviews which results in the emergence of a final picture of the 
personality. 

In a retrospective and projected consideration of psycho- 
diagnosis as a science, the author traces the development of the 
theories of the academic psychologist and the practices of the 
clinical psychologist as relatively independent movements. A 
shift in the center of gravity in psychology from emphasis on the 
generalized average man to the structure and organization of the 
individual person is recommended as a common effort in which 
the academic and clinical psychologist can unite for scientific 
advance. 

This explicative but untechnical presentation of psychodiag- 
nostic procedures supplies an important link in the development 
of a closer liaison between clinical psychologists and professionals 
in allied fields. The effectiveness of this book as a text book for 
beginning students in clinical psychology is fortified by the 
practical presentation of the subject and the interesting and 
illustrative case material. 

The section on vocational testing is a little disappointing. As 
the author recognizes, the selection of tests is circumscribed and 
consequently not very representative; furthermore, the reader is 
left with the impression that vocational recommendations emerge 
chiefly, if not solely, from the test results. At least a brief
mention of client-centered vocational counseling as something
more than test administration and interpretation would have
been helpful.

William F. Holmes
The Psychological Corporation


Lorraine A. Dahl. Public School Audiometry. Danville, Ill.:
The Interstate Printers and Publishers, 1949, pp. 290. 


Up to the present, the measurement of hearing status among
school children has received neither the attention nor the support
that it deserves. This is partly due to inconsistent techniques 
and standards in the various surveys and the programs for con- 
servation of hearing. Nevertheless, it is well established that 
there are a large number of children of school age who have seri- 
ous hearing impairments. For purposes of medical care, and 
for aid in educational and personality adjustment, it is important 
that these children be identified. The material in the present 
book should both promote and facilitate more adequate measure- 
ment of hearing status. 

This book is both a text and a field manual. Part I, on factors 
to be considered in a hearing conservation program, deals with 
evidence of the problem, factors hindering the program of hear- 
ing conservation, and socio-economic problems of the hard-of- 
hearing child. The next part considers the techniques for 
creating interest in the conservation of hearing. Part III pre- 
sents in complete and clear detail the techniques and principles 
of audiometry for the discrete frequency audiometer and for 
group audiometry. The appendixes include material on supplies, 
testing schedules, and forms of reports. There is an extensive 
bibliography. 

This book presents useful information for all those interested 
in the hearing problems of school children. No school nurse or 
other person concerned with audiometry can afford to be without 
it.

Miles A. Tinker
University of Minnesota











(Concluded from Inside Front Cover) 


something like uniformity of style, footnotes and bibliographical 
references frequently require considerable ‘marking for the 
printer,’ and this is difficult—if not impossible—where matter is 
single-spaced. 

Manuscripts should not be marked for style—this is done in 
the editorial office. In the matter of style the Journal has certain
set rules and also some marked preferences. For example, no bold- 
face type is used; the use of italics is restricted to titles of books 
and periodicals, foreign words, and subheads. As to preferences, 
in both references and bibliographies the Journal prefers ‘John
Brown’ to ‘Brown, John.’ Footnotes and bibliographical references 
are set as paragraphs with the first line indented and the following 
lines flush, rather than the first line flush and the subsequent 
lines indented. 





The Journal is published monthly from October to May—
eight issues to the volume, with the volumes running parallel with 
the calendar year. 

The price per year in the United States, Pan-American coun- 
tries, and the Philippines is $6.00; in Canada, $6.20; in other 
foreign countries, $6.40. Single issues of the current year are $1.10 
each, plus 2 cents postage to Canada and 5 cents to other foreign 
countries. 

Subscribers should notify the Publishers of change of address 
at least four weeks in advance of publication of the issue with 
which change is to take effect, and both the old and new address 
should be given. Claims for non-receipt of an issue should be made 
within two weeks after the receipt of the next succeeding number. 


WARWICK AND YORK, Publishers, BALTIMORE 2, MD.








